HiRE – A heuristic approach for user generated record extraction

dc.contributor.author: Chandrakanth, S.
dc.contributor.author: Santhi Thilagam, P.
dc.date.accessioned: 2026-02-06T06:39:22Z
dc.date.issued: 2016
dc.description.abstract: User-generated content extraction is the extraction of user posts, namely reviews and comments. Extracting such content requires identifying its record structure so that, once the content is extracted, appropriate filtering mechanisms can be applied to eliminate noise. Hence, record structure identification is an important prerequisite step for text analytics. Most existing record structure identification techniques search for repeating patterns to find the records. In this paper, a heuristic-based approach is proposed. This method uses the implicit logical organization present in the records and outputs the record structure. © Springer International Publishing Switzerland 2016.
dc.identifier.citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, Vol. 9581, pp. 33-37
dc.identifier.issn: 0302-9743
dc.identifier.uri: https://doi.org/10.1007/978-3-319-28034-9_4
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/32260
dc.publisher: Springer Verlag service@springer.de
dc.subject: Heuristics
dc.subject: Record boundary
dc.subject: Record extraction
dc.subject: User posts
dc.subject: Web content mining
dc.title: HiRE – A heuristic approach for user generated record extraction
