mirror of
https://github.com/qgis/QGIS.git
synced 2025-04-18 00:03:05 -04:00
Tidying up help file
parent
9f7dd46dd7
commit
75a1d02878
@@ -6,6 +6,7 @@ Loads and displays delimited text files
|
||||
<a href="#csv">How the delimiter, quote, and escape characters work</a><br />
|
||||
<a href="#regexp">How regular expression delimiters work</a><br />
|
||||
<a href="#wkt">How WKT text is interpreted</a><br />
|
||||
<a href="#attributes">Attributes in delimited text files</a><br />
|
||||
<a href="#example">Example of a text file with X,Y point coordinates</a><br/>
|
||||
<a href="#wkt_example">Example of a text file with WKT geometries</a><br/>
|
||||
<a href="#python">Using delimited text layers in Python</a><br/>
|
||||
@@ -68,7 +69,7 @@ It is safer to use an explicit coding if the QGis project needs to be portable.
|
||||
affects the alignment of data into fields and is equivalent to treating consecutive delimiters as a
|
||||
single delimiter. Quoted fields are never discarded.</li>
|
||||
<li>Decimal point is comma: if selected then commas are used as the decimal point in real numbers. For
|
||||
example "-51,354" is equivalent to -51.354.
|
||||
example <tt>-51,354</tt> is equivalent to -51.354.
|
||||
</li>
|
||||
</ul>
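<p>The "Decimal point is comma" option above can be pictured with a small Python sketch. It is only an illustration of the idea, not the QGis implementation; the function name <tt>parse_real</tt> and its <tt>decimal_point</tt> argument are hypothetical.</p>

```python
def parse_real(text, decimal_point="."):
    """Convert text to a float, accepting an alternative decimal point character."""
    # Assumption: the alternative character is simply mapped to "." before parsing.
    if decimal_point != ".":
        text = text.replace(decimal_point, ".")
    return float(text)


value = parse_real("-51,354", decimal_point=",")
```

<p>With a comma as the decimal point, <tt>-51,354</tt> is read as the real number -51.354.</p>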
|
||||
<h5>Geometry definition</h5>
|
||||
@@ -76,7 +77,7 @@ It is safer to use an explicit coding if the QGis project needs to be portable.
|
||||
<ul>
|
||||
<li>Point coordinates: each feature is represented as a point defined by X and Y coordinates.</li>
|
||||
<li>Well known text (WKT) geometry: each feature is represented as a well known text string, for example
|
||||
"POINT(1.525622 51.20836)". See details of the <a href="#wkt">well known text</a> format.
|
||||
<tt>POINT(1.525622 51.20836)</tt>. See details of the <a href="#wkt">well known text</a> format.</li>
|
||||
<li>No geometry (attribute only table): records will not be displayed on the map, but can be viewed
|
||||
in the attribute table and joined to other layers in QGis</li>
|
||||
</ul>
|
||||
@@ -88,7 +89,7 @@ It is safer to use an explicit coding if the QGis project needs to be portable.
|
||||
or degrees/minutes. QGis is quite permissive in its interpretation of degrees/minutes/seconds.
|
||||
A valid DMS coordinate will contain three numeric fields with an optional hemisphere prefix or suffix
|
||||
(N, E, or + are positive, S, W, or - are negative). Additional non-numeric characters are
|
||||
generally discarded. For example "N41d54'01.54"" is a valid coordinate.
|
||||
generally discarded. For example <tt>N41d54'01.54"</tt> is a valid coordinate.
|
||||
</li>
|
||||
</ul>
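<p>The degrees/minutes/seconds rule described above (three numeric fields, an optional hemisphere prefix or suffix, other characters discarded) can be sketched in Python as follows. This is a simplified illustration under those stated rules only; QGis is described as more permissive, and the function name <tt>parse_dms</tt> is hypothetical.</p>

```python
import re


def parse_dms(text):
    """Convert a degrees/minutes/seconds string to decimal degrees."""
    s = text.strip().upper()
    negative = False
    # An optional hemisphere or sign marker may lead or trail the coordinate.
    if s and s[0] in "NSEW+-":
        negative = s[0] in "SW-"
        s = s[1:]
    elif s and s[-1] in "NSEW":
        negative = s[-1] in "SW"
        s = s[:-1]
    # Keep only the numeric fields; everything else is discarded.
    parts = re.findall(r"\d+(?:\.\d+)?", s)
    if len(parts) != 3:
        raise ValueError("expected degrees, minutes and seconds: %r" % text)
    degrees, minutes, seconds = (float(p) for p in parts)
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if negative else value


latitude = parse_dms("N41d54'01.54\"")
```

<p>Here <tt>latitude</tt> is about 41.9004 decimal degrees; a <tt>S</tt> or <tt>W</tt> marker would give the same magnitude with a negative sign.</p>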
|
||||
<p>For well known text geometry the following options apply:</p>
|
||||
@@ -104,33 +105,41 @@ It is safer to use an explicit coding if the QGis project needs to be portable.
|
||||
</ul>
|
||||
|
||||
<h4><a name="csv">How the delimiter, quote, and escape characters work</a></h4>
|
||||
<p>Records are split into fields using three character sets: delimiter characters, quote characters,
|
||||
and escape characters. Quote and escape characters cannot be the same as delimiter characters - they
|
||||
<p>Records are split into fields using three character sets:
|
||||
delimiter characters, quote characters, and escape characters.
|
||||
Other characters in the record are considered as data, split into
|
||||
fields by delimiter characters.
|
||||
Quote characters occur in pairs and cause the text between them to be treated as data. Escape characters cause the character following them to be treated as data.
|
||||
</p>
|
||||
<p>
|
||||
Quote and escape characters cannot be the same as delimiter characters - they
|
||||
will be ignored if they are. Escape characters can be the same as quote characters, but behave differently
|
||||
if they are.</p>
|
||||
<p>The delimiter characters are used to mark the end of each field. If more than one delimiter character
|
||||
is defined then any one of the characters can mark the end of a field. The quote and escape characters
|
||||
can override the delimiter character, so that it is treated as a normal character.</p>
|
||||
can override the delimiter character, so that it is treated as a normal data character.</p>
|
||||
<p>Quote characters may be used to mark the beginning and end of quoted fields. Quoted fields can
|
||||
contain delimiters and may span multiple lines in the text file. If a field is quoted then it must
|
||||
start and end with the same quote character. Quote characters cannot occur within a field unless they
|
||||
are escaped.</p>
|
||||
<p>Escape characters which are not quote characters force the following character to be treated normally
|
||||
<p>Escape characters which are not quote characters force the following character to be treated as data.
|
||||
(that is, to stop it being treated as a new line, delimiter, or quote character).
|
||||
</p>
|
||||
<p>If a quote character is also an escape character, then it can be represented in a quoted field by
|
||||
entering it twice. For example if ' is a quote character and an escape character, then the string
|
||||
'Smith''s Creek' will represent the value Smith's Creek.
|
||||
<p>Escape characters that are also quote characters have a much more limited effect. They only apply within quotes and only escape themselves. For example, if
|
||||
<tt>'</tt> is a quote and escape character, then the string
|
||||
<tt>'Smith''s Creek'</tt> will represent the value Smith's Creek.
|
||||
</p>
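<p>The two escaping behaviours described above can be tried out with Python's standard <tt>csv</tt> module, which follows similar conventions. This is a stand-in for illustration only; it is not the parser QGis uses, and the helper <tt>split_record</tt> is hypothetical.</p>

```python
import csv


def split_record(line, **dialect):
    """Split one record into fields using the given csv dialect options."""
    return next(csv.reader([line], **dialect))


# ' is both the quote and the escape character: doubling it inside quotes
# produces a single literal quote.
doubled = split_record("'Smith''s Creek';10",
                       delimiter=";", quotechar="'", doublequote=True)

# \ is an escape character that is not a quote character: the delimiter that
# follows it is treated as data.
escaped = split_record("a\\;b;c",
                       delimiter=";", escapechar="\\", quoting=csv.QUOTE_NONE)
```

<p>The first record yields the fields <tt>Smith's Creek</tt> and <tt>10</tt>; the second yields <tt>a;b</tt> and <tt>c</tt>.</p>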
|
||||
|
||||
|
||||
<h4><a name="regexp">How regular expression delimiters work</a></h4>
|
||||
<p>Regular expressions are a mini-language used to represent character patterns. There are many variations
|
||||
of regular expression syntax - QGis uses the syntax provided by the <a href="http://qt-project.org/doc/qt-4.8/qregexp.html">QRegExp</a> class of the <a href="http://qt.digia.com">Qt</a> framework.</p>
|
||||
<p>In a regular expression delimited file each line is treated as a record. Each match of the regular expression in the line is treated as the end of a field.
|
||||
If the regular expression contains grouped expressions (eg "(cat|dog)")
|
||||
If the regular expression contains capture groups (eg <tt>(cat|dog)</tt>)
|
||||
then these are extracted as fields.
|
||||
If this is not desired then use non-capturing groups eg "(?:cat|dog)".
|
||||
If this is not desired then use non-capturing groups (eg <tt>(?:cat|dog)</tt>).
|
||||
</p>
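<p>The effect of capture groups on the resulting fields can be seen with Python's <tt>re.split</tt>, which behaves in a comparable way. Python's <tt>re</tt> syntax is not identical to QRegExp, so treat this as an analogy and not as the QGis behaviour itself.</p>

```python
import re

line = "1cat2dog3"

# A capturing group: the matched delimiter text is kept as an extra field.
with_groups = re.split(r"(cat|dog)", line)

# A non-capturing group: the delimiter only separates the fields.
without_groups = re.split(r"(?:cat|dog)", line)
```

<p>With the capturing group the delimiter text appears between the data fields; with the non-capturing group only <tt>1</tt>, <tt>2</tt> and <tt>3</tt> remain.</p>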
|
||||
<p>The regular expression is treated differently if it is anchored to the start of the line (that is, the pattern starts with "^".
|
||||
<p>The regular expression is treated differently if it is anchored to the start of the line (that is, the pattern starts with <tt>^</tt>).
|
||||
In this case the regular expression is matched against each line. If the line does not match it is discarded
|
||||
as an invalid record. Each capture group in the expression is treated as a field. The regular expression
|
||||
is invalid if it does not have capture groups. As an example this can be used as a (somewhat
|
||||
@@ -143,19 +152,46 @@ expression
|
||||
Lines less than 45 characters long will be discarded.
|
||||
</p>
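<p>As a rough Python sketch of the anchored case: each line is matched against the whole pattern, every capture group becomes a field, and lines that do not match are dropped. The pattern and column widths below are hypothetical and are not the expression from the example above; Python's <tt>re</tt> module is used in place of QRegExp.</p>

```python
import re

# Hypothetical fixed-width layout: 5, 10 and 3 character columns.
PATTERN = re.compile(r"^(.{5})(.{10})(.{3})")


def split_fixed_width(lines):
    """Return one list of fields per line that matches the anchored pattern."""
    records = []
    for line in lines:
        match = PATTERN.match(line)
        if match is None:
            # Lines that do not match are discarded as invalid records.
            continue
        records.append(list(match.groups()))
    return records


records = split_fixed_width(["AAAAABBBBBBBBBBCCC", "short"])
```

<p>Only the first line is long enough to match, so <tt>records</tt> holds a single record with three fields.</p>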
|
||||
|
||||
|
||||
<h4><a name="wkt">How WKT text is interpreted</a></h4>
|
||||
<p>
|
||||
The delimited text layer recognizes the following
|
||||
<a href="http://en.wikipedia.org/wiki/Well-known_text">well known text</a> types -
|
||||
POINT, MULTIPOINT, LINESTRING, MULTILINESTRING, POLYGON, and MULTIPOLYGON. It will accept geometries with
|
||||
a Z coordinate (eg "POINT Z"), a measure ("POINT M"), or both ("POINT ZM").
|
||||
<tt>POINT</tt>, <tt>MULTIPOINT</tt>, <tt>LINESTRING</tt>, <tt>MULTILINESTRING</tt>, <tt>POLYGON</tt>, and <tt>MULTIPOLYGON</tt>.
|
||||
It will accept geometries with
|
||||
a Z coordinate (eg <tt>POINT Z</tt>), a measure (<tt>POINT M</tt>), or both (<tt>POINT ZM</tt>).
|
||||
</p>
|
||||
<p>
|
||||
It can also handle the PostGIS EWKT variation, in which the geometry is preceded by a spatial reference
|
||||
system id (eg "SRID=4326;POINT(175.3 41.2)"), and a variant used by Informix in which the WKT is
|
||||
preceded by an integer spatial reference id (eg "1 POINT(175.3 41.2)").
|
||||
system id (eg <tt>SRID=4326;POINT(175.3 41.2)</tt>), and a variant used by Informix in which the WKT is
|
||||
preceded by an integer spatial reference id (eg <tt>1 POINT(175.3 41.2)</tt>).
|
||||
In both cases the SRID is ignored.
|
||||
</p>
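<p>One way to picture how the two SRID prefixes are ignored is a small Python helper that strips them before the WKT is parsed. This is an illustrative sketch only; the name <tt>strip_srid</tt> and the regular expression are assumptions, not the QGis implementation.</p>

```python
import re

# Matches either an EWKT prefix such as "SRID=4326;" or a leading integer id
# followed by whitespace, as in the Informix variant.
_SRID_PREFIX = re.compile(r"^\s*(?:SRID=\d+;|\d+\s+)")


def strip_srid(wkt):
    """Return the WKT text with any leading SRID prefix removed."""
    return _SRID_PREFIX.sub("", wkt, count=1)


ewkt = strip_srid("SRID=4326;POINT(175.3 41.2)")
informix = strip_srid("1 POINT(175.3 41.2)")
plain = strip_srid("POINT(175.3 41.2)")
```

<p>All three calls return <tt>POINT(175.3 41.2)</tt>.</p>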
|
||||
|
||||
|
||||
|
||||
<h4><a name="attributes">Attributes in delimited text files</a></h4>
|
||||
<p>Each record in the delimited text file is split into fields representing
|
||||
attributes of the record. Usually the attribute names are taken from the first
|
||||
data record in the file. However if this does not contain attribute names, then they will be named <tt>field_1</tt>, <tt>field_2</tt>, and so on. QGis may override
|
||||
the names in the text file if they are numbers, or have names like <tt>field_#</tt>,
|
||||
or are duplicated.
|
||||
</p>
|
||||
<p>
|
||||
In addition to the attributes explicitly in the data file QGis assigns a unique
|
||||
feature id to each record. This is the line number in the source file on which
|
||||
the record starts.
|
||||
</p>
|
||||
<p>
|
||||
Each attribute also has a data type, one of string (text), integer, or real number.
|
||||
The data type is inferred from the content of the fields - if every non-blank value
|
||||
is a valid integer then the type is integer, otherwise if it is a valid real
|
||||
number then the type is real, otherwise the type is string. Note that this is
|
||||
based on the content of the fields - quoting fields does not change the way they
|
||||
are interpreted.
|
||||
</p>
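<p>The type inference rule above can be sketched in Python. This is a minimal illustration, not the QGis code: the function name <tt>infer_type</tt> is hypothetical, Python's <tt>int</tt> and <tt>float</tt> accept some forms (such as <tt>1_000</tt> or <tt>nan</tt>) that QGis may not, and treating a column with no non-blank values as string is an assumption.</p>

```python
def _parses_as(convert, value):
    """Return True if convert(value) succeeds."""
    try:
        convert(value)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Infer 'integer', 'real' or 'string' from the non-blank values of a field."""
    non_blank = [v for v in values if v.strip()]
    if not non_blank:
        return "string"  # assumption: the source does not cover this case
    if all(_parses_as(int, v) for v in non_blank):
        return "integer"
    if all(_parses_as(float, v) for v in non_blank):
        return "real"
    return "string"


column_types = [infer_type(["1", "2", ""]),
                infer_type(["1", "2.5"]),
                infer_type(["1", "abc"])]
```

<p>The three sample columns are classified as integer, real and string respectively. Blank values are skipped, so they do not affect the result.</p>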
|
||||
|
||||
|
||||
<h4><a name="example">Example of a text file with X,Y point coordinates</a></h4>
|
||||
<pre>
|
||||
X;Y;ELEV<br />
|
||||
@@ -167,7 +203,6 @@ X;Y;ELEV<br />
|
||||
<ul>
|
||||
<li> Uses <b>;</b> as delimiter. Any character can be used to delimit the fields.</li>
|
||||
<li>The first row is the header row. It contains the field names X, Y and ELEV.</li>
|
||||
<li>No quotes (") are used to delimit text fields.</li>
|
||||
<li>The x coordinates are contained in the X field.</li>
|
||||
<li>The y coordinates are contained in the Y field.</li>
|
||||
</ul>
|
||||
@@ -200,39 +235,42 @@ filename="test.csv"<br />
|
||||
uri=QUrl.fromLocalFile(filename)<br />
|
||||
uri.addQueryItem("type","csv")<br />
|
||||
uri.addQueryItem("delimiter","|")<br />
|
||||
uri.addQueryItem("wktField","wkt")<br />
|
||||
# ... other delimited text parameters<br />
|
||||
layer=QgsVectorLayer(QString(uri.toEncoded()),"Test CSV layer","delimitedtext")<br />
|
||||
# Add the layer to the map<br />
|
||||
if layer.isValid():<br />
|
||||
    QgsMapLayerRegistry.instance().addMapLayer( layer )<br />
|
||||
</pre>
|
||||
<p>This could be used to load the second example file above.</p>
|
||||
<p>The configuration of the delimited text layer is defined by adding query items to the uri.
|
||||
The following options can be added
|
||||
</p>
|
||||
<ul>
|
||||
<li><i>encoding=..</i> defines the file encoding. The default is "UTF-8"</li>
|
||||
<li><i>type=(csv|regexp|whitespace)</i> defines the delimiter type. Valid values are csv,
|
||||
regexp, and whitespace (which is just a special case of regexp). Default is csv.</li>
|
||||
<li><i>delimiter=...</i> defines the delimiters that will be used for csv formatted files,
|
||||
or the regular expression for regexp formatted files. Default is , for CSV files. There is
|
||||
<li><tt>encoding=..</tt> defines the file encoding. The default is "UTF-8".</li>
|
||||
<li><tt>type=(csv|regexp|whitespace)</tt> defines the delimiter type. Valid values are csv,
|
||||
regexp, and whitespace (which is just a special case of regexp). The default is csv.</li>
|
||||
<li><tt>delimiter=...</tt> defines the delimiters that will be used for csv formatted files,
|
||||
or the regular expression for regexp formatted files. The default is , for CSV files. There is
|
||||
no default for regexp files.</li>
|
||||
<li><i>quote=..</i> (for csv files) defines the characters used to quote fields. Default is "</li>
|
||||
<li><i>escape=..</i> (for csv files) defines the characters used to escape the special meaning of the next character. Default is "</li>
|
||||
<li><i>skipLines=#</i> defines the number of lines to discard from the beginning of the file. Default is 0.</li>
|
||||
<li><i>useHeader=(yes|no)</i> defines whether the first data record contains the names of the data fields. Default is yes.</li>
|
||||
<li><i>trimFields=(yes|no)</i> defines whether leading and trailing whitespace is to be removed from unquoted fields. Default is no.</li>
|
||||
<li><i>maxFields=#</i> defines the maximum number of fields that will be loaded from the file.
|
||||
Additional fields in each record will be discarded. Default is 0 - display all fields.
|
||||
<li><tt>quote=..</tt> (for csv files) defines the characters used to quote fields. The default is "</li>
|
||||
<li><tt>escape=..</tt> (for csv files) defines the characters used to escape the special meaning of the next character. The default is "</li>
|
||||
<li><tt>skipLines=#</tt> defines the number of lines to discard from the beginning of the file. The default is 0.</li>
|
||||
<li><tt>useHeader=(yes|no)</tt> defines whether the first data record contains the names of the data fields. The default is yes.</li>
|
||||
<li><tt>trimFields=(yes|no)</tt> defines whether leading and trailing whitespace is to be removed from unquoted fields. The default is no.</li>
|
||||
<li><tt>maxFields=#</tt> defines the maximum number of fields that will be loaded from the file.
|
||||
Additional fields in each record will be discarded. The default is 0 - include all fields.
|
||||
(This option is not available from the delimited text layer dialog box).</li>
|
||||
<li><i>skipEmptyFields=(yes|no)</i> defines whether empty unquoted fields will be discarded if they are empty (applied after trimFields). Default is no.</li>
|
||||
<li><i>decimalPoint=.</i> specifies an alternative character that may be used as a decimal point in numeric fields. Default is a point (full stop) character.</li>
|
||||
<li><i>wktField=fieldname</i> specifies the name or number (starting at 1) of the field containing a well known text geometry definition</li>
|
||||
<li><i>xField=fieldname</i> specifies the name or number (starting at 1) of the field the X coordinate (only applies if wktField is not defined)</li>
|
||||
<li><i>yField=fieldname</i> specifies the name or number (starting at 1) of the field the Y coordinate (only applies if wktField is not defined)</li>
|
||||
<li><i>geomType=(auto|point|line|polygon|none)</i> specifies type of geometry for wkt fields, or none to load the file as an attribute-only table. Default is auto.</li>
|
||||
<li><i>crs=...</i> specifies the coordinate system to use for the vector layer, in a format accepted by QgsCoordinateReferenceSystem.createFromString (for example "EPSG:4167"). If this is not
|
||||
specified then a dialog box may request this information from the user.</li>
|
||||
<li><i>quiet=(yes|no)</i> specifies whether errors encountered loading the layer are presented in a dialog box (they will be written to the QGis log in any case). Default is no.</li>
|
||||
<li><tt>skipEmptyFields=(yes|no)</tt> defines whether empty unquoted fields will be discarded (applied after trimFields). The default is no.</li>
|
||||
<li><tt>decimalPoint=.</tt> specifies an alternative character that may be used as a decimal point in numeric fields. The default is a point (full stop) character.</li>
|
||||
<li><tt>wktField=fieldname</tt> specifies the name or number (starting at 1) of the field containing a well known text geometry definition.</li>
|
||||
<li><tt>xField=fieldname</tt> specifies the name or number (starting at 1) of the field containing the X coordinate (only applies if wktField is not defined).</li>
|
||||
<li><tt>yField=fieldname</tt> specifies the name or number (starting at 1) of the field containing the Y coordinate (only applies if wktField is not defined).</li>
|
||||
<li><tt>geomType=(auto|point|line|polygon|none)</tt> specifies the type of geometry for wkt fields, or none to load the file as an attribute-only table. The default is auto.</li>
|
||||
<li><tt>crs=...</tt> specifies the coordinate system to use for the vector layer, in a format accepted by QgsCoordinateReferenceSystem.createFromString (for example "EPSG:4167"). If this is not
|
||||
specified then a dialog box may request this information from the user
|
||||
when the layer is loaded (depending on QGis CRS settings).</li>
|
||||
<li><tt>quiet=(yes|no)</tt> specifies whether errors encountered loading the layer are presented in a dialog box (they will be written to the QGis log in any case). The default is no.</li>
|
||||
</ul>
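<p>For readers who want to see what the resulting uri string looks like without running QGis, the query items above can be assembled with Python's standard library. This sketch only builds the string; the helper <tt>delimited_text_uri</tt> and the path <tt>/tmp/test.csv</tt> are hypothetical, and it does not replace the QUrl based approach shown earlier.</p>

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def delimited_text_uri(path, **options):
    """Build a file uri whose query items are delimited text options."""
    # urlencode percent-encodes special characters such as "|".
    return "file://" + quote(path) + "?" + urlencode(options)


uri = delimited_text_uri("/tmp/test.csv",
                         type="csv", delimiter="|", wktField="wkt")

# Decoding the query string recovers the original option values.
options = parse_qs(urlsplit(uri).query)
```

<p>The delimiter <tt>|</tt> is percent-encoded as <tt>%7C</tt> in the uri and decoded again when the query string is parsed.</p>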
|
||||
|
||||
|
||||
|