java.lang.Object
  bb.io.FileParser
public class FileParser
Many file formats consist of lines of data, with tokens of data on each line being separated by a constant set of delimiters. Familiar examples are tab, space, and comma delimited files. This class was written to aid the parsing of such file types.
You simply construct an instance for the desired file, along with regular expressions for the delimiter token(s) and nondata lines (e.g. comment or blank lines). Then you may repeatedly call readDataLine and process the data. When finished, call close.
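The construct, read, close cycle described above can be sketched with JDK classes alone. This is a hypothetical stand-in that mimics the documented behavior, not the source of bb.io.FileParser: the class name DataLineSketch, the method readAllDataLines, and the use of Matcher.matches and Pattern.split are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not the FileParser implementation.
class DataLineSketch {

    /** Returns the tokens of every data line, skipping lines that fully match nondataLineRegexp. */
    static List<String[]> readAllDataLines(String text, String tokenDelimiterRegexp, String nondataLineRegexp) {
        Pattern delimiter = Pattern.compile(tokenDelimiterRegexp);
        // A null nondata regexp means every line is treated as a data line.
        Pattern nondata = (nondataLineRegexp == null) ? null : Pattern.compile(nondataLineRegexp);
        List<String[]> rows = new ArrayList<>();
        // try-with-resources plays the role of the final close() call.
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (nondata != null && nondata.matcher(line).matches()) {
                    continue; // comment or blank line
                }
                rows.add(delimiter.split(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "# header comment\nalpha,beta\n\ngamma\tdelta  epsilon\n";
        for (String[] tokens : readAllDataLines(text, "[ ]+|[\\t,]", "#.*|\\s*")) {
            System.out.println(String.join(" | ", tokens));
        }
    }
}
```

The sketch reads from a String so that it runs without a file on disk; the real class takes a File.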
Warning: parsing tab, space, and comma delimited files becomes considerably more complicated if the tokens themselves may contain any of the token delimiters. In that case, you need to know how the delimiter is escaped so that it can appear inside a token (e.g. Excel may put double quotes around tokens).
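To illustrate the warning, the sketch below contrasts a plain comma split with a quote-aware split. It is not part of this class: QuotedTokenSketch and splitCsvLine are hypothetical names, and the rule that a doubled quote inside a quoted token stands for one literal quote is an assumption modeled on the common CSV convention.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; FileParser does not provide this.
class QuotedTokenSketch {

    /** Splits on commas that lie outside double quotes; "" inside a quoted token yields one literal quote. */
    static List<String> splitCsvLine(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted token
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                tokens.add(current.toString()); // delimiter outside quotes ends the token
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        String line = "1,\"Smith, John\",42";
        // The plain split cuts the quoted name in two.
        System.out.println("plain split token count: " + line.split(",").length);
        System.out.println("quote-aware token count: " + splitCsvLine(line).size());
    }
}
```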
This class is not multithread safe.
Field Summary | |
---|---|
private File | file |
private ParseReader | in |
private int | lastLineNumber |
private Pattern | nondataLinePattern |
private Pattern | tokenDelimiterPattern |
Constructor Summary | |
---|---|
FileParser(File file, String tokenDelimiterRegexp, String nondataLineRegexp) | Constructor. |
Method Summary | |
---|---|
void | close(): Closes all resources associated with the parsing. |
String | getLocation(): Returns the location (line number and file path) associated with the previous call to readDataLine. |
boolean | isNonDataLine(String line) |
String[] | readDataLine(): Reads the next line of data for the file, parses all the tokens on that line (using tokenDelimiterRegexp), and returns them. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private final File file
private final ParseReader in
private final Pattern tokenDelimiterPattern
private final Pattern nondataLinePattern
private int lastLineNumber
Constructor Detail |
---|
public FileParser(File file, String tokenDelimiterRegexp, String nondataLineRegexp) throws IllegalArgumentException, IllegalStateException, SecurityException, IOException, UnsupportedEncodingException, PatternSyntaxException
Parameters:
tokenDelimiterRegexp - regular expression that matches token delimiters (e.g. "[ ]+|[\\t,]" matches one or more spaces, or a single tab or comma)
nondataLineRegexp - regular expression that matches nondata lines (e.g. "#.*|\\s*" matches any line which starts with '#' or which is empty or all whitespace); may be null, in which case every line is treated as a data line
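The two example regular expressions can be tried directly with java.util.regex. The sketch below assumes the delimiter is applied with Pattern.split and that a nondata line must match the whole pattern (Matcher.matches); both are assumptions about how the class uses them, and RegexpExamplesSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demonstration class; only the two regexps come from the documentation.
class RegexpExamplesSketch {
    static final Pattern TOKEN_DELIMITER = Pattern.compile("[ ]+|[\\t,]");
    static final Pattern NONDATA_LINE = Pattern.compile("#.*|\\s*");

    /** Splits a line on every match of the delimiter pattern. */
    static String[] tokens(String line) {
        return TOKEN_DELIMITER.split(line);
    }

    /** True if the entire line matches the nondata pattern. */
    static boolean isNonData(String line) {
        return NONDATA_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        // A run of spaces counts as one delimiter; a tab or a comma counts as one each.
        System.out.println(Arrays.toString(tokens("a   b\tc,d")));
        // A comma followed by a space is two adjacent delimiters, so an empty token appears between them.
        System.out.println(Arrays.toString(tokens("a, b")));
        System.out.println(isNonData("# comment"));    // starts with '#'
        System.out.println(isNonData("   "));          // all whitespace
        System.out.println(isNonData("x # trailing")); // '#' is not at the start of the line
    }
}
```

Under these assumptions, a delimiter such as "[ \\t,]+" would treat any mixed run of spaces, tabs, and commas as a single delimiter, at the cost of no longer detecting empty fields.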
Throws:
IllegalArgumentException - if file is null, does not exist, is a directory, or refers to a file that cannot be read by this application; or if tokenDelimiterRegexp == null
IllegalStateException - if file holds more than Integer.MAX_VALUE bytes (which cannot be held in a Java array)
SecurityException - if a security manager exists and its SecurityManager.checkRead(java.lang.String) method denies read access to file
IOException - if an I/O problem occurs
UnsupportedEncodingException - if the default char encoding used by ParseReader is not supported (this should never happen)
PatternSyntaxException - if either regexp's syntax is invalid
Method Detail |
---|
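public String[] readDataLine() throws IOException
Reads the next line of data for the file, parses all the tokens on that line (using tokenDelimiterRegexp), and returns them.
Throws:
IOException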
public boolean isNonDataLine(String line)
public String getLocation() throws IllegalStateException
Returns the location (line number and file path) associated with the previous call to readDataLine. Typically use this method when reporting errors associated with the data obtained from that call.
Throws:
IllegalStateException - if getLocation is called before readDataLine has ever been called
public void close()
Closes all resources associated with the parsing.
Specified by:
close in interface Closeable
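The error-reporting use of getLocation described above might look like the following sketch. LocationSketch is a hypothetical stand-in built on JDK classes, not this class's source; the "line N of name" format, the comma delimiter, and returning null at end of input are assumptions of the sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for illustration only; not the FileParser implementation.
class LocationSketch {
    private final BufferedReader in;
    private final String sourceName;
    private int lastLineNumber = 0; // 0 means readDataLine has never been called

    LocationSketch(String text, String sourceName) {
        this.in = new BufferedReader(new StringReader(text));
        this.sourceName = sourceName;
    }

    /** Returns the next line split on commas, or null at end of input (an assumption of this sketch). */
    String[] readDataLine() {
        try {
            String line = in.readLine();
            if (line == null) {
                return null;
            }
            lastLineNumber++; // remember which line the returned tokens came from
            return line.split(",");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Describes where the tokens from the previous readDataLine call came from. */
    String getLocation() {
        if (lastLineNumber == 0) {
            throw new IllegalStateException("getLocation called before readDataLine");
        }
        return "line " + lastLineNumber + " of " + sourceName;
    }

    public static void main(String[] args) {
        LocationSketch parser = new LocationSketch("1,2\n3,oops\n", "example.csv");
        String[] tokens;
        while ((tokens = parser.readDataLine()) != null) {
            for (String token : tokens) {
                try {
                    Integer.parseInt(token);
                } catch (NumberFormatException e) {
                    // Report the bad value together with where it was read.
                    System.err.println("Bad integer \"" + token + "\" at " + parser.getLocation());
                }
            }
        }
    }
}
```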