Nidus Systems
i-Vocab Metafile Specification Proposal

The i-Vocab file hierarchy is a stream of idiosyncratic vocabulary collected from an author over time. It attempts to be a "fingerprint" of the author, indexed by individual author, workpiece, and date. The first pass is to run a common OCR engine over each article, book, or thesis and generate a vocabulary file for the author, noting the date of creation. This generates the i-Vocab base dict, which is not changed afterward. Best guess for the cutoff is to use words which are at the .001 level? So Maslow's base would contain words such as "isomorphic," "Weltanschauung," "aggridant," "eupsychia," etc.
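
A minimal sketch of that first pass, in Python, assuming the OCR step has already produced plain text. The proposal does not say what ".001 level" is measured against, so this sketch reads it as relative frequency in some general-usage word list. The `reference_freq` table, the `base_vocab` name, and the tokenizing pattern are all hypothetical, not part of the proposal.

```python
import re
from collections import Counter

def base_vocab(text, reference_freq, threshold=0.001):
    """Return {word: count} for words in `text` that are rare in general usage.

    ASSUMPTION: `reference_freq` maps a lowercase word to its relative
    frequency in ordinary prose, and "rare" means below `threshold`.
    Words missing from the reference table are treated as frequency 0.
    """
    # Crude tokenizer: runs of letters, allowing internal apostrophes/hyphens.
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", text)
    counts = Counter(w.lower() for w in words)
    return {w: n for w, n in counts.items()
            if reference_freq.get(w, 0.0) < threshold}
```

Called on a scanned work with a suitable reference table, this would keep words like "eupsychia" and drop words like "the". A real cutoff would need tuning against actual texts.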

YYYYMMDD.iv0 (digit zero) is the proposed base file, and successive gathers are YYYYMMDD.ivo (letter 'o'). The .ivo files can be gathered by day, month, and year for ease of historical processing. Dated files with no content are deleted before the monthly and yearly files are written, and monthly files without content are also deleted. Native UNIX utilities are used to build the repository structure. The key to making it work is consistent upkeep and discrete stacks kept not only by author but by author-and-written-text. This way, new words like "orthomentor," "eularchy," and "Inodalyn" can be used to map sociograms over time, etc.
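
The naming and pruning rules above can be sketched as follows. The proposal calls for native UNIX utilities; Python is used here only to make the logic explicit. The monthly file name `YYYYMM.ivo`, the plain concatenation of daily files, and the function names are assumptions for illustration, since the proposal does not fix them.

```python
from datetime import date
from pathlib import Path
from typing import List, Optional

def iv_name(day: date, base: bool = False) -> str:
    """Build YYYYMMDD.iv0 (base, digit zero) or YYYYMMDD.ivo (gather, letter o)."""
    return day.strftime("%Y%m%d") + (".iv0" if base else ".ivo")

def prune_empty(stack_dir: Path) -> List[str]:
    """Delete zero-length .ivo files in one stack; return the names removed."""
    removed = []
    for path in sorted(stack_dir.glob("*.ivo")):
        if path.stat().st_size == 0:
            path.unlink()
            removed.append(path.name)
    return removed

def gather_month(stack_dir: Path, year: int, month: int) -> Optional[Path]:
    """Prune empty files, then concatenate the month's daily gathers.

    ASSUMPTION: the monthly file is named YYYYMM.ivo and is a plain
    concatenation of the daily files in date order. Returns None when the
    month has no content, so no empty monthly file is ever written.
    """
    prune_empty(stack_dir)
    prefix = f"{year:04d}{month:02d}"
    daily = sorted(stack_dir.glob(prefix + "??.ivo"))
    if not daily:
        return None
    out = stack_dir / (prefix + ".ivo")
    out.write_text("".join(p.read_text() for p in daily))
    return out
```

One stack directory per author-and-written-text keeps the gathers discrete, as the paragraph above requires. A yearly gather would follow the same pattern over the monthly files.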

The base idiosyncratic vocabulary metafile is used for B-analysis, factor analysis, and Petri-style addition to growing tips, as well as for linking to its own unique protocol. The protocol allows for realtime, Internet-wide plug-in functionality. The author's i-Vocab metafile can be attached to a process via sockets, etc.

The i-Vocab base file produces a planar map of uniqueness and frequency, etc., and can be used as a blueprint for the author. This could predict future joinings to other authors and could support realtime analysis and display of linguistic tropisms as they develop.

Will post i-Vocab metafile for Abraham Maslow's paper "Theory Z", his "Eupsychian Management" book, and the master index at the University of Akron as time permits. As words are added by scanning future works, they will contribute to the base file backchaining.

There will need to be a root dictionary, but idiosyncratic vocabularies can be caught and sampled now. The root dictionary is mostly obvious.

You are not expected to understand this.