Ask HN: Concept Taxonomy Tools?

4 points by mud_dauber 1 year ago | 3 comments
I've got a Pocket bookmark repo with ~15K articles in it. I'm reasonably disciplined about tagging each saved article, but am swimming in tags (~1200). I'm attempting to organize the tags offline using pencil & paper (doing it in a code editor isn't cutting it). Pocket attempts to streamline things, but its UI leaves a lot to be desired and is platform-dependent.

It's a laborious but interesting exercise - the paper taxonomy looks like it will have 50-100 top-level topics with 10-30 subtopics under each.

- Some are concatenations (jobdefs-career-work, fun-humor).
- Some have multiple meanings (attention/behavior & attention/LLMs).
- Some are being renamed as I go (chat -> chatgpt).
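
For anyone who wants to script those three kinds of cleanup rather than do them by hand, here is a minimal sketch in plain Python. The tag names are the examples from this post; the rule tables, the `normalize` function, and the "llm"/"psychology" hint tags are hypothetical and only illustrate the idea. Nothing here talks to Pocket.

```python
# Hypothetical cleanup rules. The tags are the examples from the post;
# the hint tags ("llm", "psychology") are made up for illustration.
RENAMES = {"chat": "chatgpt"}
SPLITS = {
    "jobdefs-career-work": ["jobdefs", "career", "work"],
    "fun-humor": ["fun", "humor"],
}
# An ambiguous tag is resolved by another tag found on the same article.
AMBIGUOUS = {
    "attention": {"llm": "attention/LLMs", "psychology": "attention/behavior"},
}


def normalize(tags):
    """Return the cleaned, de-duplicated tag list for one article."""
    tags = [t.strip().lower() for t in tags]
    context = set(tags)
    out = []
    for tag in tags:
        tag = RENAMES.get(tag, tag)
        if tag in AMBIGUOUS:
            senses = AMBIGUOUS[tag]
            matches = [sense for hint, sense in senses.items() if hint in context]
            # Keep the tag as-is (for manual review) unless exactly one hint matches.
            pieces = matches if len(matches) == 1 else [tag]
        else:
            pieces = SPLITS.get(tag, [tag])
        for piece in pieces:
            if piece not in out:
                out.append(piece)
    return out


print(normalize(["chat", "fun-humor"]))
print(normalize(["attention", "llm"]))
```

Keeping the rules in plain dictionaries means the paper taxonomy can be transcribed into them incrementally, and ambiguous tags with no deciding hint are left untouched instead of being guessed.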

Once complete I'll try matching this list with another pulled from my personal Jekyll site.
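
A rough sketch of what that matching step could look like, assuming both tag lists have already been exported as plain lists of strings. The function name `match_tags`, the 0.8 cutoff, and the sample tags in the usage lines are assumptions for illustration; `difflib.get_close_matches` is from the Python standard library.

```python
from difflib import get_close_matches


def match_tags(pocket_tags, site_tags, cutoff=0.8):
    """Sort Pocket tags into exact matches, near matches, and unmatched tags."""
    site = sorted({t.strip().lower() for t in site_tags})
    site_set = set(site)
    exact, near, unmatched = [], {}, []
    for tag in sorted({t.strip().lower() for t in pocket_tags}):
        if tag in site_set:
            exact.append(tag)
            continue
        # Fuzzy fallback: best candidate at or above the similarity cutoff.
        candidates = get_close_matches(tag, site, n=1, cutoff=cutoff)
        if candidates:
            near[tag] = candidates[0]
        else:
            unmatched.append(tag)
    return exact, near, unmatched


# Made-up sample tags, only to show the call shape.
exact, near, unmatched = match_tags(
    ["chatgpt", "humor", "career"],
    ["ChatGPT", "humour", "gardening"],
)
print(exact, near, unmatched)
```

String similarity only catches spelling variants; tags that mean the same thing but share no letters would still need a manual pass or an embedding-based comparison.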

I looked at one or two online taxonomy tools before embarking on this sacred quest. Does this community have a favorite tool for this type of job-to-be-done?

  • CrypticShift 1 year ago
    Did you try ChatGPT? It may also assist you in the matching.

    As a visual helper, you could also chart the 1K tags as embeddings in a 2D graph [1].

    My setup when manually working on this is a split-screen outliner (filter, drag&drop...).

    Once the taxonomy is done, I assume you won't be able to reflect those levels back in Pocket? I just see a flat list of tags there.

    [1] https://cookbook.openai.com/examples/visualizing_embeddings_...

    • mud_dauber 1 year ago
      I’ve used ChatGPT to create related concept lists. I haven’t settled on a prompting strategy for the Pocket corpus (yet). The OAI recipe looks interesting, TY!

      And you’re right - I was using abbreviations, parent/child nomenclature, etc. to hack Pocket’s tag structure. Yuck.

      • CrypticShift 1 year ago
        Make sure to use GPT-4 (not 3.5).

        I would say 10k/1k+ (articles/tags) is too much for Pocket. I prefer bookmark managers that have a dual folder/tag system. I try to take advantage of this duality in my taxonomy choices: hierarchical folders are at the base, with tags as more "lateral" descriptors.