MEL: Metadata Extractor & Loader

Rodríguez Méndez, Sergio J.; Omran, Pouya G.; Haller, Armin; Taylor, Kerry

Date

2021

Authors

Rodríguez Méndez, Sergio J.

Omran, Pouya G.

Haller, Armin

Taylor, Kerry

Abstract

The metadata and content-based information extraction tasks from heterogeneous file sets are pre-processing steps of many Knowledge Graph Construction Pipelines (KGCP). These tasks often take longer than necessary due to the lack of proper tools that integrate several complementary extraction methods and properties to get a rich output set. This paper presents MEL, a Python-based tool that implements a set of methods to extract metadata and content-based information from unstructured information encoded in different source document formats. The results are generated as JSON files, which can: (a) optionally be stored in a document store, and (b) easily be mapped to RDF using a variety of tools such as J2RM. MEL supports more than 20 different file types, making it a versatile tool that aids pre-processing tasks as part of a KGCP based on comprehensive configurable settings.
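
The extract-then-serialize flow described in the abstract can be illustrated with a minimal sketch. This is not MEL's implementation or API: the function names `extract_basic_metadata` and `write_metadata_json`, the JSON field names, and the one-JSON-file-per-source layout are all assumptions made for illustration. It uses only the Python standard library and covers only file-system level metadata, not content-based extraction.

```python
import json
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def extract_basic_metadata(path: Path) -> dict:
    """Collect file-system level metadata for one file.

    Hypothetical helper for illustration; the field names are assumptions,
    not MEL's output schema.
    """
    stat = path.stat()
    mime_type, _ = mimetypes.guess_type(path.name)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": path.name,
        "extension": path.suffix.lower(),
        "mime_type": mime_type,  # None when the extension is not recognised
        "size_bytes": stat.st_size,
        "modified_utc": modified.isoformat(),
    }


def write_metadata_json(source: Path, out_dir: Path) -> Path:
    """Write the metadata of `source` to `<out_dir>/<file name>.json`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{source.name}.json"
    record = extract_basic_metadata(source)
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target
```

A JSON record produced this way could then be loaded into a document store or passed to a JSON-to-RDF mapping step, which is the role the abstract assigns to MEL's output.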

Keywords

Data Analysis Pipeline, Data Pre-processing, Information Extraction, Knowledge Graph Construction, Metadata Extraction

URI

https://hdl.handle.net/1885/733801523

Collections

ANU Research Publications

Type

Conference paper

Entity type

Publication
