Entity types available in Mergeflow
Mergeflow detects and extracts various entities from the content it collects. You can see these entities in tag clouds, for example. Above each tag cloud, a toggle lets you select the entity type you are interested in.
Currently, Mergeflow's analytics detect the following entity types in texts:
Anatomy
Anatomical parts of organisms, as defined by the NIH's Medical Subject Headings (MeSH).
Company
Names of companies, including new, never-before-seen ones. Our algorithms usually merge several legal entities of the same organization into one company name. For example, "Siemens AG", "Siemens Austria GmbH", "Siemens USA LLC" etc. would all be considered "Siemens".
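The merging behavior described above can be illustrated with a small sketch. This is not Mergeflow's actual algorithm. The word lists and the function name `canonical_company_name` are hypothetical and only cover the Siemens example. The idea shown is simply to strip trailing legal-form and region tokens so that several legal entities collapse into one name.

```python
import re

# Hypothetical, deliberately short word lists (assumptions for illustration).
LEGAL_FORMS = {"ag", "gmbh", "llc", "inc", "ltd", "corp", "plc"}
REGION_WORDS = {"austria", "usa", "germany", "uk"}
DROPPABLE = LEGAL_FORMS | REGION_WORDS


def canonical_company_name(name: str) -> str:
    """Strip trailing legal-form and region tokens from a company name."""
    # Split on whitespace and commas.
    tokens = re.findall(r"[^\s,]+", name)
    # Remove tokens from the right while they look like legal forms or
    # regions, but always keep at least one token.
    while len(tokens) > 1 and tokens[-1].lower().strip(".") in DROPPABLE:
        tokens.pop()
    return " ".join(tokens)


names = ["Siemens AG", "Siemens Austria GmbH", "Siemens USA LLC"]
merged = {canonical_company_name(n) for n in names}
print(merged)
```

A rule-based approach like this only handles suffixes it already knows, so it should be read as an illustration of the intended result rather than of how new, never-before-seen company names are detected.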
Country
Names of countries mentioned in a text, not the country of the source. For example, if a source from Japan mentions "Switzerland" in an article, the country would be "Switzerland", not "Japan".
Disease
Diseases as defined by the 10th revision of the International Classification of Diseases, ICD-10.
Emerging Technology
Mergeflow has semantic models for more than 200 emerging or potentially disruptive technologies, from across various industries, including aerospace, agriculture, computing, energy, manufacturing, materials, medical, and transportation.
Event
Events in Mergeflow are trade fairs and conferences.
Investor
Investors in Mergeflow are mostly venture capital investors.
Location
Locations are mostly cities. As with countries, what is tagged is what is mentioned in a source, not the location of the source itself.
Material
Materials are chemical molecules. To determine whether a term is a chemical molecule, we use the PubChem database as a reference: if the term can be found in PubChem, we consider it a molecule.
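The lookup rule can be sketched as a simple membership test. In this sketch, a tiny local set stands in for PubChem, and all names (`SAMPLE_COMPOUND_NAMES`, `in_sample_reference`, `is_material`, `tag_materials`) are hypothetical. How Mergeflow actually queries PubChem is not described here, so the lookup is passed in as a function that could be swapped for a real query.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for PubChem: a few names assumed to be in the
# reference database. A real check would query PubChem instead.
SAMPLE_COMPOUND_NAMES = {"ethanol", "benzene", "titanium dioxide"}


def in_sample_reference(term: str) -> bool:
    """Case-insensitive lookup against the local stand-in set."""
    return term.strip().lower() in SAMPLE_COMPOUND_NAMES


def is_material(term: str,
                lookup: Callable[[str], bool] = in_sample_reference) -> bool:
    """A term counts as a material if the reference lookup finds it."""
    return lookup(term)


def tag_materials(terms: Iterable[str],
                  lookup: Callable[[str], bool] = in_sample_reference) -> List[str]:
    """Keep only the candidate terms that the reference lookup recognizes."""
    return [t for t in terms if is_material(t, lookup)]


print(tag_materials(["Ethanol", "blockchain", "titanium dioxide"]))
```

Passing the lookup as a parameter keeps the tagging rule ("found in the reference means molecule") separate from the question of how the reference is accessed.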
Organism
For our definition of "organism", we use MeSH categories as a reference.
Organization
Organizations in Mergeflow are mostly universities and R&D institutions. If an organization is recognizable as a company, it is tagged as "company", not as "organization".
Patent Class
Mergeflow uses the Cooperative Patent Classification (CPC) system, down to the "subclass" hierarchy level (e.g. "A61B" or "C01D"). Mergeflow's algorithms assign patent classes to all non-patent documents that Mergeflow collects. For patent documents, we use the patent class labels provided by the European Patent Office's DOCDB data set.
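To make the "subclass" level concrete: a CPC symbol begins with a section letter (A to H, or Y), a two-digit class, and a subclass letter, followed by group information. The sketch below truncates a full symbol to its four-character subclass. The function name `cpc_subclass` and the example inputs are illustrative assumptions, not part of Mergeflow.

```python
import re
from typing import Optional

# Section letter (A-H or Y), two-digit class, subclass letter,
# e.g. the "A61B" in "A61B 5/02".
_SUBCLASS = re.compile(r"^\s*([A-HY]\d{2}[A-Z])", re.IGNORECASE)


def cpc_subclass(symbol: str) -> Optional[str]:
    """Return the subclass prefix of a CPC symbol, or None if it has none."""
    match = _SUBCLASS.match(symbol)
    return match.group(1).upper() if match else None


for symbol in ["A61B 5/02", "C01D3/04", "not a code"]:
    print(symbol, "->", cpc_subclass(symbol))
```

Because only the first four characters are kept, every group and subgroup under a subclass maps to the same tag.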
Person
Mergeflow detects names of people, including names it has never seen before.