Project form

Project title:	Automatic Information Extraction
Organisation/Company:	Høgskolen i Østfold, IT
Contact person(s):	Ky Van Ha
Project description:	The amount of data available on the Internet today is enormous. Finding the facts of interest in it is difficult, because the information is unstructured and mostly available in natural-language form. In addition to locating the relevant documents (with a web search engine), much time is usually spent reading and analysing the texts in order to extract those facts. Information extraction is the process of analysing unstructured texts and extracting the information relevant to some problem into a structured representation. In this project, we would like to build an automatic agent-based system to extract information from any type of document. This process is needed to support a semantic web approach. We will use GATE, the General Architecture for Text Engineering, to perform this task.
What makes this project useful/interesting?
Any requirements for students' prior knowledge:	Java programming
Any requirements for special equipment (hardware/software):	None
Other: