public class WPOSRu extends Object
Constructor and Description |
---|
WPOSRu() |
Modifier and Type | Method and Description |
---|---|
static POS | checkIfSuchPOSExist(String pos_name) |
static POSText | guessPOS(StringBuffer text) The POS should be extracted from the text. |
static POS | guessPOSWith2ndLevelHeader(String page_title, String pos_title, StringBuffer text) The POS should be extracted from the text. |
static boolean | isSecondLevelHeaderWordNotPOS(String str) Returns true if str is a known header. |
static POSText[] | splitToPOSSections(String page_title, LangText lt) page_title is the word described in this article's text. |
public static boolean isSecondLevelHeaderWordNotPOS(String str)
public static POSText[] splitToPOSSections(String page_title, LangText lt)
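lt - its .text is parsed and split; .lang is not used at present, but may be in the future.
1) Splits the following text into "lead I" and "lead II".
2) Extracts the part of speech "гл" (verb) from "lead II".

== lead I ==
English text1
== lead II==
===Морфологические и синтаксические свойства===
{{гл en reg|lead}}

Here "Морфологические и синтаксические свойства" is the Russian Wiktionary header "Morphological and syntactic properties", and "гл" abbreviates "глагол" (verb).
TODO: isPOSHeader() (remove acce'nt -> accent) or guessPOS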
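The split on second-level headers can be sketched as follows. This is a minimal, stand-alone illustration and not the WPOSRu implementation: the class SecondLevelSplitSketch, its Section holder, and the regular expression are all assumptions made for the example, and it works on a plain String rather than on LangText and POSText. The sample text uses an English stand-in for the third-level header, and the Cyrillic template name is written with Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the WPOSRu implementation. */
final class SecondLevelSplitSketch {

    /** Hypothetical holder for one section: header title plus body text. */
    static final class Section {
        final String title;
        final String body;

        Section(String title, String body) {
            this.title = title;
            this.body = body;
        }
    }

    // Matches "== title ==" on one line, with exactly two '=' on each side,
    // so third-level headers such as "=== title ===" are not treated as splits.
    private static final Pattern SECOND_LEVEL =
            Pattern.compile("(?<!=)==(?!=)\\s*(.+?)\\s*(?<!=)==(?!=)");

    /** Splits text into sections; text before the first header is ignored. */
    static List<Section> split(String text) {
        List<Section> sections = new ArrayList<>();
        Matcher m = SECOND_LEVEL.matcher(text);
        String title = null;
        int bodyStart = 0;
        while (m.find()) {
            if (title != null) {
                // Close the previous section at the start of this header.
                sections.add(new Section(title, text.substring(bodyStart, m.start()).trim()));
            }
            title = m.group(1);
            bodyStart = m.end();
        }
        if (title != null) {
            sections.add(new Section(title, text.substring(bodyStart).trim()));
        }
        return sections;
    }

    public static void main(String[] args) {
        // "===Properties===" stands in for the Russian third-level header.
        String text = "== lead I ==\nEnglish text1\n"
                + "== lead II==\n===Properties===\n{{\u0433\u043b en reg|lead}}";
        for (Section s : split(text)) {
            System.out.println(s.title + " -> " + s.body.replace("\n", " "));
        }
    }
}
```

Each resulting body could then be passed to a POS-guessing step, which is the role guessPOS plays in the real class.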
public static POSText guessPOS(StringBuffer text)
noun:
===Морфологические и синтаксические свойства===
{{сущ en|слоги=lead|lead|leads}}

verb:
===Морфологические и синтаксические свойства===
{{гл ru 4b-ся
{{гл ru 8b/b^
{{гл ru 5c'^-т

adjective:
===Морфологические и синтаксические свойства===
{{прил ru 1*a

adverb:
===Морфологические и синтаксические свойства===
{{adv ru|слоги={{по-слогам|ра|но|ва́|то}}|или=предикатив|или-кат=предикативы|}}
{{adv-ru|
Наречие, неизменяемое.

Old formatting:
===Морфологические и синтаксические свойства===
{{СущМужНеодуш1c(1)
{{СущЖенНеодуш8a
Существительное, ...
{{прил ia}}
{{парадигма-рус // old formatting (>500, < 1000 pages)
|шаблон=Гл11b/c
{{Гл1a
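One way to read the examples above is that the template name after "{{" begins with a POS marker: "сущ" for nouns, "гл" for verbs, "прил" for adjectives, and "adv" for adverbs, with capitalized variants in the old formatting. The sketch below illustrates that idea only. It is not the WPOSRu implementation: the class TemplatePosSketch, the prefix table, and the string labels it returns are assumptions, and the real method returns a POSText. Prefix matching this loose may also misclassify unrelated templates. The Cyrillic prefixes are written with Unicode escapes so the file compiles under any source encoding.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the WPOSRu implementation. */
final class TemplatePosSketch {

    // Lower-cased template-name prefixes taken from the examples in the text.
    // The labels on the right are hypothetical, not the library's POS constants.
    private static final Map<String, String> PREFIX_TO_POS = new LinkedHashMap<>();
    static {
        PREFIX_TO_POS.put("\u0441\u0443\u0449", "noun");            // Cyrillic noun prefix
        PREFIX_TO_POS.put("\u0433\u043b", "verb");                  // Cyrillic verb prefix
        PREFIX_TO_POS.put("\u043f\u0440\u0438\u043b", "adjective"); // Cyrillic adjective prefix
        PREFIX_TO_POS.put("adv", "adverb");
    }

    // Captures a template name: everything after "{{" up to '|', '}' or a line break.
    private static final Pattern TEMPLATE = Pattern.compile("\\{\\{([^|}\\n]+)");

    /** Returns the label of the first recognized template, or "unknown". */
    static String guess(String text) {
        Matcher m = TEMPLATE.matcher(text);
        while (m.find()) {
            // Lower-casing lets old-style capitalized names match the same prefixes.
            String name = m.group(1).trim().toLowerCase(Locale.ROOT);
            for (Map.Entry<String, String> e : PREFIX_TO_POS.entrySet()) {
                if (name.startsWith(e.getKey())) {
                    return e.getValue();
                }
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(guess("{{adv-ru|"));
        System.out.println(guess("{{\u0433\u043b ru 4b-\u0441\u044f"));
    }
}
```

Templates that match no prefix are skipped, so a later template in the same text can still decide the result.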
public static POS guessPOSWith2ndLevelHeader(String page_title, String pos_title, StringBuffer text)
page_title - word, name of the article, e.g. "lead"
pos_title - extracted 2nd level title, e.g. "lead I", "lead II", or "Adverb" (old style)

Copyright © 2011-2016 Ubiquitous Knowledge Processing (UKP) Lab. All Rights Reserved.