public class WikiParser extends Object
Definitions:
[[...]] - wikilink,
[http:// site name] - hyperlink.

Constructor and Description |
---|
WikiParser() |
Modifier and Type | Method and Description |
---|---|
static StringBuffer | convertWikiToText(StringBuffer wiki_text, LanguageType lang, boolean b_remove_not_expand_iwiki) - Removes / expands interwiki, removes categories, expands wiki links. |
static StringBuffer | parseCurlyBrackets(StringBuffer text) - Removes texts within curly brackets. |
static StringBuffer | parseDoubleApostrophe(StringBuffer text) - Removes double apostrophes used in pairs. |
static StringBuffer | parseDoubleBrackets(StringBuffer text, LanguageType lang, boolean b_remove_not_expand_iwiki) - Removes and expands interwiki, categories, and wiki links in wiki texts. |
static StringBuffer | parseSingleBrackets(StringBuffer text) - Expands / removes hyperlinks. |
static StringBuffer | parseTripleApostrophe(StringBuffer text) - Removes triple apostrophes used in pairs. |
static StringBuffer | removeAcuteAccent(StringBuffer text, LanguageType wiki_lang) - Removes the acute accent sign "'" from Russian wiki texts; it often appears at the beginning of an article. |
static StringBuffer | removeBracketsInInterwiki(StringBuffer text) - Expands interwiki by removing interwiki brackets and language code. |
static StringBuffer | removeBracketsInWikiLink(StringBuffer text) - Deprecated. |
static StringBuffer | removeCategory(StringBuffer text, LanguageType lang) - Removes categories for the selected language. |
static StringBuffer | removeHTMLComments(StringBuffer text) - Removes all comments: <!-- ... |
static StringBuffer | removeInterwiki(StringBuffer text) - Removes interwiki. |
static StringBuffer | removePreCode(StringBuffer text) - Removes preformatted code. |
static StringBuffer | removeSourceCode(StringBuffer text) - Removes all source code: <source ... |
static StringBuffer | removeXMLTag(StringBuffer text, String tag) - Removes the XML tag together with its text, up to the next closing tag. |
static StringBuffer | removeXMLTagCode(StringBuffer text) - Removes the XML tag together with its text, up to the next closing tag. |

public static StringBuffer removeInterwiki(StringBuffer text)
public static StringBuffer removeBracketsInInterwiki(StringBuffer text)
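
The difference between removing and expanding an interwiki link can be illustrated with a small standalone sketch. It does not call WikiParser and is not its implementation; the class name `InterwikiSketch`, the use of `String` instead of `StringBuffer`, and the regular expression (a 2-3 letter lowercase language code) are assumptions made only for this example.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not the WikiParser source.
final class InterwikiSketch {
    // Assumption: an interwiki link looks like [[xx:Title]], where xx is a
    // lowercase language code of two or three letters.
    private static final Pattern INTERWIKI =
            Pattern.compile("\\[\\[[a-z]{2,3}:([^\\]|]+)\\]\\]");

    // Drops the whole link, as described for removeInterwiki.
    static String remove(String text) {
        return INTERWIKI.matcher(text).replaceAll("");
    }

    // Keeps the title and drops brackets and language code,
    // as described for removeBracketsInInterwiki.
    static String expand(String text) {
        return INTERWIKI.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        String wiki = "[[et:Talvepalee]] text";
        System.out.println("[" + remove(wiki) + "]"); // prints "[ text]"
        System.out.println("[" + expand(wiki) + "]"); // prints "[Talvepalee text]"
    }
}
```
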
public static StringBuffer removeCategory(StringBuffer text, LanguageType lang)
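
Category removal depends on the language because the category keyword is localized, for example Category in English and Kategorio in Esperanto. The sketch below shows the idea under stated assumptions: it is not the WikiParser code, the class `CategorySketch` is hypothetical, and a plain `String` keyword stands in for the `LanguageType` argument.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not the WikiParser source.
final class CategorySketch {
    // Removes every [[<categoryWord>:...]] link. The keyword is passed in
    // directly here; the real method takes a LanguageType instead.
    static String removeCategory(String text, String categoryWord) {
        Pattern category = Pattern.compile(
                "\\[\\[" + Pattern.quote(categoryWord) + ":[^\\]]*\\]\\]");
        return category.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeCategory("Text [[Category:Palaces]]", "Category"));
        System.out.println(removeCategory("Teksto [[Kategorio:Palacoj]]", "Kategorio"));
    }
}
```

Passing the keyword of a different language leaves the text unchanged, which is why the parsed wiki language has to be known.
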
public static StringBuffer removeXMLTag(StringBuffer text, String tag)
Removes the XML tag together with its text, up to the next closing tag.
public static StringBuffer removeXMLTagCode(StringBuffer text)
Removes the XML tag together with its text, up to the next closing tag,
e.g. "a x+y b" -> "a b".
public static StringBuffer removeHTMLComments(StringBuffer text)
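
As a rough illustration of the tag and comment removal described for removeXMLTag, removeXMLTagCode, and removeHTMLComments, the following standalone sketch deletes an element together with its content and strips `<!-- ... -->` comments. It is not the WikiParser implementation: the class `TagRemovalSketch`, the regular expressions, and the tag name `code` used in `main` are assumptions for the example.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not the WikiParser source.
final class TagRemovalSketch {
    // Removes <tag ...> through the next </tag>, including the enclosed text.
    // Text with an opening tag but no closing tag is left unchanged.
    static String removeTag(String text, String tag) {
        String name = Pattern.quote(tag);
        Pattern element = Pattern.compile(
                "<" + name + "(\\s[^>]*)?>.*?</" + name + ">", Pattern.DOTALL);
        return element.matcher(text).replaceAll("");
    }

    // Removes every <!-- ... --> comment, including multi-line ones.
    static String removeComments(String text) {
        return Pattern.compile("<!--.*?-->", Pattern.DOTALL)
                .matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // The tag name "code" is an assumption chosen for this example.
        System.out.println(removeTag("a <code>x+y</code> b", "code"));
        System.out.println(removeComments("a<!-- note -->b")); // prints "ab"
    }
}
```

The reluctant quantifier `.*?` stops at the first closing tag, which matches the "till the next" wording; nested elements of the same name are not handled by this sketch.
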
public static StringBuffer removePreCode(StringBuffer text)
public static StringBuffer removeSourceCode(StringBuffer text)
@Deprecated public static StringBuffer removeBracketsInWikiLink(StringBuffer text)
public static StringBuffer parseSingleBrackets(StringBuffer text)
public static StringBuffer parseDoubleBrackets(StringBuffer text, LanguageType lang, boolean b_remove_not_expand_iwiki)
b_remove_not_expand_iwiki - if true, removes interwiki, e.g. "[[et:Talvepalee]] text" -> " text".
lang - defines the parsed wiki language; it is needed to remove categories for the selected language, e.g. English (Category) or Esperanto (Kategorio).
public static StringBuffer parseCurlyBrackets(StringBuffer text)
public static StringBuffer parseDoubleApostrophe(StringBuffer text)
public static StringBuffer parseTripleApostrophe(StringBuffer text)
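
A minimal sketch of removing paired apostrophe markers is given below. It is not the WikiParser code; the class `ApostropheSketch` is hypothetical, and it assumes that the text enclosed by a pair is kept while the markers themselves are dropped. Triple apostrophes are processed before double ones so that a `'''` marker is not mistaken for a `''` marker.

```java
// Hypothetical illustration only; not the WikiParser source.
final class ApostropheSketch {
    // Drops ''' markers that occur in pairs and keeps the enclosed text.
    static String stripTriple(String text) {
        return text.replaceAll("'''(.*?)'''", "$1");
    }

    // Drops '' markers that occur in pairs and keeps the enclosed text.
    static String stripDouble(String text) {
        return text.replaceAll("''(.*?)''", "$1");
    }

    public static void main(String[] args) {
        String wiki = "'''bold''' and ''italic''";
        // Triple markers first, then double markers.
        System.out.println(stripDouble(stripTriple(wiki))); // prints "bold and italic"
    }
}
```

An unpaired marker, such as a single `''` with no partner, is left in place by both methods.
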
public static StringBuffer removeAcuteAccent(StringBuffer text, LanguageType wiki_lang)
public static StringBuffer convertWikiToText(StringBuffer wiki_text, LanguageType lang, boolean b_remove_not_expand_iwiki)
b_remove_not_expand_iwiki - if true, removes interwiki, e.g. "[[et:Talvepalee]] text" -> " text"; otherwise expands interwiki by removing interwiki brackets and language code, e.g. "[[et:Talvepalee]] text" -> "Talvepalee text".
lang - defines the parsed wiki language; it is needed to remove categories for the selected language, e.g. English (Category) or Esperanto (Kategorio).

Copyright © 2011-2016 Ubiquitous Knowledge Processing (UKP) Lab. All Rights Reserved.