Entity types available in Mergeflow
Currently, Mergeflow's analytics detect the following entity types in texts:
Anatomy
Anatomical parts of organisms, as defined by the NIH's Medical Subject Headings (MeSH).
Company
Names of companies, including new, never-before-seen ones. Our algorithms usually merge several legal entities of the same organization into one company name. For example, "Siemens AG", "Siemens Austria GmbH", "Siemens USA LLC" etc. would all be considered "Siemens".
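The merging of legal entities can be illustrated with a minimal sketch. It is not Mergeflow's actual algorithm, which is not described here. It assumes that dropping legal-form suffixes and region words is enough to reach a shared name. The lists `LEGAL_FORMS` and `REGION_WORDS` and the function `normalize_company` are hypothetical names introduced only for this illustration.

```python
import re

# Hypothetical, deliberately short word lists (assumptions for illustration only).
LEGAL_FORMS = {"ag", "gmbh", "llc", "inc", "ltd", "corp", "plc"}
REGION_WORDS = {"austria", "usa", "germany", "uk"}
DROP_WORDS = LEGAL_FORMS | REGION_WORDS


def normalize_company(name: str) -> str:
    """Reduce a legal entity name to a shared company name."""
    # Split on whitespace, commas, and periods.
    tokens = re.findall(r"[^\s,.]+", name)
    kept = [t for t in tokens if t.lower() not in DROP_WORDS]
    # Fall back to the original name if every token was dropped.
    return " ".join(kept) if kept else name.strip()


names = ["Siemens AG", "Siemens Austria GmbH", "Siemens USA LLC"]
merged = {normalize_company(n) for n in names}
```

With these assumed word lists, all three example names collapse into the single entry "Siemens". A production system would need far larger lists and would have to handle companies whose real name contains a region word.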
Country
Names of countries mentioned in texts, not the country of the source. For example, if a source from Japan mentions "Switzerland" in an article, the country would be "Switzerland", not "Japan".
Disease
Diseases as defined by the 10th revision of the International Classification of Diseases, ICD-10.
Emerging Technology
Mergeflow has semantic models for more than 200 emerging or potentially disruptive technologies, from across various industries, including aerospace, agriculture, computing, energy, manufacturing, materials, medical, and transportation. A table with the current list of models is available here.
Event
Events in Mergeflow are trade fairs and conferences.
Investor
Investors in Mergeflow are mostly venture capital investors.
Location
Locations are mostly cities. As with countries, what is tagged is what is mentioned in a source, not the location of the source itself.
Material
Materials are chemical molecules. To determine whether something is a chemical molecule, we use the PubChem database as reference. If the term under consideration can be found in PubChem, we consider it a molecule.
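The lookup rule above can be sketched as a simple membership test. This is an illustration under stated assumptions, not Mergeflow's implementation: the reference database is replaced by a small in-memory set, and `make_local_lookup`, `tag_materials`, and `pubchem_stub` are hypothetical names. A real system would query PubChem itself instead of the stub.

```python
from typing import Callable, Iterable, List


def make_local_lookup(known_names: Iterable[str]) -> Callable[[str], bool]:
    """Build a case-insensitive membership test over a set of known names."""
    index = {n.casefold() for n in known_names}
    return lambda term: term.strip().casefold() in index


def tag_materials(terms: Iterable[str], in_reference: Callable[[str], bool]) -> List[str]:
    """Keep only the terms that are found in the reference database."""
    return [t for t in terms if in_reference(t)]


# Stand-in for PubChem (assumption): three compound names only.
pubchem_stub = make_local_lookup(["ethanol", "aspirin", "caffeine"])

candidates = ["Ethanol", "blockchain", "caffeine "]
materials = tag_materials(candidates, pubchem_stub)
```

Passing the lookup in as a function keeps the tagging rule separate from the data source, so the stub could be swapped for a real PubChem query without changing `tag_materials`.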
Organism
For our definition of "organism", we use MeSH categories as reference.
Organization
Organizations in Mergeflow are mostly universities and R&D institutions. If an organization is recognizable as a company, it is tagged as "Company", not as "Organization".
Patent Class
Mergeflow uses the CPC classification system, down to the "subclass" hierarchy level (e.g. "A61B" or "C01D"). Mergeflow's algorithms tag all non-patent documents with patent classes. For patent documents, we use the patent class labels provided by the European Patent Office's DOCDB data set.
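To make the "subclass" level concrete: a full CPC symbol such as "A61B 5/0205" consists of a section letter, a two-digit class, a subclass letter, and then group information. The sketch below truncates a full symbol to its subclass. The function name `cpc_subclass` is hypothetical, and the code only illustrates the hierarchy level. It does not show how Mergeflow assigns patent classes to documents.

```python
import re

# Section letter (A-H or Y), two-digit class, subclass letter.
CPC_SUBCLASS = re.compile(r"^([A-HY])(\d{2})([A-Z])")


def cpc_subclass(symbol: str) -> str:
    """Return the subclass part of a CPC symbol, e.g. 'A61B 5/0205' -> 'A61B'."""
    match = CPC_SUBCLASS.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a CPC symbol with a subclass: {symbol!r}")
    return "".join(match.groups())


labels = [cpc_subclass(s) for s in ["A61B 5/0205", "c01d15/08"]]
```

Here `labels` holds the two subclass codes used as examples above, "A61B" and "C01D". Input that does not start with a section, class, and subclass raises a `ValueError`.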
Person
Mergeflow detects names of people, including names it has never seen before.