
Weeding out street names from Names entities

0 votes

I'm finding that street names are showing up in the Names entities of the collection I'm testing. Since it would be time-consuming to list them all in the kill list, is there any way to include a wildcard in the kill list (e.g. * St.)?

asked Jan 4, 2016 by Susan M.

1 Answer

0 votes

Thanks, Susan, for your question. Wildcards are not allowed in the kill list. Could you give us some examples of street names that were recognized as personal names, so that we can improve the NER to exclude them in future?

answered Jan 4, 2016 by Peter_Chan (2,770 points)

Thanks for your quick reply. The street names that came up as personal names are ones that could also be people's names (e.g. Thomson, Graham, Morgan), so I suppose I'm looking for an easy way to differentiate between names and locations.

We are enhancing our NER to cross-reference DBpedia. We have very promising results with certain entity types, such as diseases. However, there will still be many situations that need human judgement. "California" in one message may refer to the "State of California" and in another message to the geographical location "California".
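
For anyone who wants to pre-screen a list of extracted names outside ePADD, here is a minimal sketch of a context check. It is not an ePADD feature and does not use any ePADD API. It assumes you have the message text and the candidate names available as plain strings, and the suffix list and function names (`STREET_SUFFIXES`, `looks_like_street`, `filter_names`) are made up for illustration. A name is flagged only when every occurrence in the text is followed by a street-type suffix, so a name that also appears on its own is kept for human review.

```python
import re

# Hypothetical list of street-type suffixes; extend it for your own collection.
STREET_SUFFIXES = ("St", "Street", "Ave", "Avenue", "Rd", "Road",
                   "Blvd", "Dr", "Drive", "Ln", "Lane")


def street_pattern(name):
    """Build a regex matching `name` followed by a street suffix, e.g. 'Thomson St.'."""
    suffixes = "|".join(re.escape(s) for s in STREET_SUFFIXES)
    return re.compile(r"\b" + re.escape(name) + r"\s+(?:" + suffixes + r")\b\.?")


def looks_like_street(name, text):
    """Return True if `name` occurs in `text` and every occurrence is followed
    by a street suffix. Matching is case-sensitive."""
    total = len(re.findall(r"\b" + re.escape(name) + r"\b", text))
    as_street = len(street_pattern(name).findall(text))
    return total > 0 and as_street == total


def filter_names(names, text):
    """Keep only the candidate names that do not look like street names in `text`."""
    return [n for n in names if not looks_like_street(n, text)]


if __name__ == "__main__":
    sample = "Meet me at 12 Thomson St. tomorrow. Graham will join us on Morgan Avenue."
    print(filter_names(["Thomson", "Graham", "Morgan"], sample))
```

A heuristic like this will still miss cases and misfire on others (for example, "Dr" may be a title rather than "Drive"), so it is best treated as a way to shortlist entries for manual review rather than a replacement for it.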

ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.