Package org.languagetool.tokenizers
Class SRXSentenceTokenizer
java.lang.Object
org.languagetool.tokenizers.SRXSentenceTokenizer
- All Implemented Interfaces:
SentenceTokenizer, Tokenizer
- Direct Known Subclasses:
SimpleSentenceTokenizer
Class to tokenize sentences using rules from an SRX file.
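As a rough illustration of what rule-based segmentation in the SRX style involves, the sketch below walks every position in a text and lets the first matching rule decide whether to break there. It uses only the Java standard library. The class name SrxStyleSplitter, the Rule type, and both regular expressions are assumptions made for this example; they are not taken from segment.srx, and this is not how SRXSentenceTokenizer itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for SRX-style rule matching; not LanguageTool code.
class SrxStyleSplitter {

    /** One SRX-like rule: a break flag plus patterns for the text before and after a position. */
    static final class Rule {
        final boolean isBreak;
        final Pattern before; // must match text that ends exactly at the position
        final Pattern after;  // must match text that starts exactly at the position

        Rule(boolean isBreak, String beforeRegex, String afterRegex) {
            this.isBreak = isBreak;
            this.before = Pattern.compile("(?:" + beforeRegex + ")\\z");
            this.after = Pattern.compile(afterRegex);
        }

        boolean matchesAt(String text, int pos) {
            return before.matcher(text.substring(0, pos)).find()
                && after.matcher(text.substring(pos)).lookingAt();
        }
    }

    // Assumed example rules, checked in order; they do NOT come from segment.srx.
    static final List<Rule> RULES = new ArrayList<>();
    static {
        // Exception rule: never break right after a title abbreviation.
        RULES.add(new Rule(false, "\\b(?:Mr|Mrs|Dr)\\.\\s+", ""));
        // Break rule: sentence-final punctuation, whitespace, then an uppercase letter.
        RULES.add(new Rule(true, "[.?!]+\\s+", "\\p{Lu}"));
    }

    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int pos = 1; pos < text.length(); pos++) {
            for (Rule rule : RULES) {
                if (rule.matchesAt(text, pos)) {
                    if (rule.isBreak) {
                        sentences.add(text.substring(start, pos));
                        start = pos;
                    }
                    break; // the first matching rule decides this position
                }
            }
        }
        if (start < text.length()) {
            sentences.add(text.substring(start));
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String sentence : split("Dr. Smith arrived. He sat down.")) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```

With these two assumed rules, the input "Dr. Smith arrived. He sat down." yields two segments, "Dr. Smith arrived. " and "He sat down.", because the exception rule is listed first and suppresses the break after "Dr.".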
-
Field Summary
Fields -
Constructor Summary
Constructors
Constructor and Description:
SRXSentenceTokenizer(Language language): Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool.
SRXSentenceTokenizer(Language language, String srxInClassPath)
-
Method Summary
Modifier and Type, Method, Description:
final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
final boolean singleLineBreaksMarksPara()
tokenize: Tokenize the given string to sentences.
-
Field Details
-
srxDocument
private final net.loomchild.segment.srx.SrxDocument srxDocument
-
language
-
parCode
-
-
Constructor Details
-
SRXSentenceTokenizer
Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool.
-
SRXSentenceTokenizer
- Parameters:
srxInClassPath - the path to an SRX file in the classpath
- Since:
3.2
-
-
Method Details
-
tokenize
Description copied from interface: SentenceTokenizer
Tokenize the given string to sentences.
- Specified by:
tokenize in interface SentenceTokenizer
- Specified by:
tokenize in interface Tokenizer
-
singleLineBreaksMarksPara
public final boolean singleLineBreaksMarksPara()
- Specified by:
singleLineBreaksMarksPara in interface SentenceTokenizer
-
setSingleLineBreaksMarksParagraph
public final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
- Specified by:
setSingleLineBreaksMarksParagraph in interface SentenceTokenizer
- Parameters:
lineBreakParagraphs - if true, single line breaks are assumed to end a paragraph; if false, only two or more consecutive line breaks end a paragraph
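The difference between the two settings of lineBreakParagraphs can be pictured with a small standalone sketch. The helper below is hypothetical and uses only the Java standard library; it mimics the documented paragraph-boundary behavior on "\n" line breaks and is not the code LanguageTool runs internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the lineBreakParagraphs flag; not LanguageTool code.
class ParagraphBreakSketch {

    static List<String> splitParagraphs(String text, boolean lineBreakParagraphs) {
        // true: any run of line breaks ends a paragraph;
        // false: only a run of two or more line breaks does.
        String boundary = lineBreakParagraphs ? "\\n+" : "\\n{2,}";
        List<String> paragraphs = new ArrayList<>();
        for (String part : text.split(boundary)) {
            if (!part.isEmpty()) {
                paragraphs.add(part);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String text = "First line\nSecond line\n\nThird line";
        System.out.println(splitParagraphs(text, true).size());
        System.out.println(splitParagraphs(text, false).size());
    }
}
```

For the sample text in main, the true setting gives three paragraphs and the false setting gives two, since "First line" and "Second line" are separated by a single line break only.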
-