{"id":1408,"date":"2024-01-16T11:49:21","date_gmt":"2024-01-16T10:49:21","guid":{"rendered":"https:\/\/www.gis-hub.uzh.ch\/?p=1408"},"modified":"2025-05-19T10:56:13","modified_gmt":"2025-05-19T08:56:13","slug":"comparing-two-sets-of-point-data-iii","status":"publish","type":"post","link":"https:\/\/www.gis-hub.uzh.ch\/de\/comparing-two-sets-of-point-data-iii\/","title":{"rendered":"How do I compare two sets of point data? (III)"},"content":{"rendered":"<p><strong>When we count the points of two data sets per grid cell, we can compare their densities. By generating Chi expectation surfaces, we can compare the actual densities with the expected densities. Coming back to our geographic names with &#8220;wald&#8221; in Switzerland, we can check whether these names are over- or underrepresented compared to all geographic names.<\/strong><\/p>\n\n\n\n<p class=\"has-background\" style=\"background-color:#ffd103b8\">For better understanding,&nbsp;<strong><em>QGIS commands<\/em><\/strong>&nbsp;are marked in bold italics.<\/p>\n\n\n\n<p>We start with a single grid and two layers containing counts per grid cell. One layer should contain the expected (underlying) population, in our case the counts of all the geographical names per grid cell; the other layer contains the feature we are interested in comparing to the underlying population, e.g., the geographical names containing &#8220;wald&#8221;. If you did part II of the point data tutorial, you will have to create a second layer with the counts of all the geographical names. Make sure to use the same grid for the counting. You don&#8217;t need to do any coloring yet.<\/p>\n\n\n\n<p><strong>1. 
Filter out zero-count cells<\/strong><br>For the calculation of the Chi value, we keep only grid cells whose count is greater than 0; otherwise the formula does not work (<strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#ffd103\" class=\"has-inline-color\">\u21af<\/mark><\/strong> zero in the denominator <strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#ffd103\" class=\"has-inline-color\">\u21af<\/mark><\/strong>). To filter the values, we open the attribute table of each layer with counts (right click on the layer &#8211;&gt; <strong><em>Open Attribute Table<\/em><\/strong>). We click on the funnel symbol and filter the column of the counts according to <strong><em>Greater than (&gt;) 0<\/em><\/strong>:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"970\" height=\"397\" src=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/filter_greater0.png\" alt=\"\" class=\"wp-image-1415\" srcset=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/filter_greater0.png 970w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/filter_greater0-300x123.png 300w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/filter_greater0-768x314.png 768w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/filter_greater0-18x7.png 18w\" sizes=\"auto, (max-width: 970px) 100vw, 970px\" \/><figcaption class=\"wp-element-caption\">Filter out the grid cells with 0 counts.<\/figcaption><\/figure>\n\n\n\n<p>We then click on <strong><em>Select Features<\/em><\/strong> and save the selection as a new layer by right-clicking on the layer and choosing <em><strong>Export &#8211;&gt; Save Selected Features As\u2026<\/strong><\/em> We do the same for the other layer.<\/p>\n\n\n\n<p><strong>2. Join the counts into one layer<\/strong><br>To calculate the Chi value, we need the counts of both layers in the same attribute table. 
This means we have to join the layer with all the geographical names to the &#8220;wald&#8221; layer. We right click on the layer with the &#8220;wald&#8221; counts and we choose <strong><em>Properties<\/em><\/strong>. In the navigation with the symbols on the left side, we choose <strong><em>Joins<\/em><\/strong>. We click on the green plus icon (<strong><em>+<\/em><\/strong>) to add a new join. In the newly opened window, we choose the following parameters:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignright size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"500\" height=\"617\" src=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/join.png\" alt=\"\" class=\"wp-image-1417\" style=\"width:421px;height:auto\" srcset=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/join.png 500w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/join-243x300.png 243w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/join-10x12.png 10w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/figure>\n<\/div>\n\n\n<p><strong><em>Join Layer<\/em><\/strong>: Layer with all the geographical names.<\/p>\n\n\n\n<p><strong><em>Join field and Target field<\/em><\/strong>: We choose the id in both layers to join on.<\/p>\n\n\n\n<p><strong><em>Joined fields<\/em><\/strong>: The field we want to add from the layer with all the geographical names to the layer with only the &#8220;wald&#8221; names. <\/p>\n\n\n\n<p>If the ids match, the layer with the &#8220;wald&#8221; names gets an entry (with the counts from all the geographical names) in the newly added column for the respective grid cell\/row. <\/p>\n\n\n\n<p>We click <strong>OK<\/strong>, then <strong>Apply<\/strong> and again <strong>OK<\/strong>. The join is now listed in the Joins section of Properties. <\/p>\n\n\n\n<p><strong>3. 
Normalization<\/strong><br>Since the two distributions have a different number of features, we have to normalize the formula for the Chi value. To find out how many features there are, we click on <strong><em>Vector <\/em>&#8211;&gt; <em>Analysis Tools<\/em> &#8211;&gt; <em>Basic Statistics for Fields<\/em><\/strong>. In the newly opened window, we choose the corresponding layer as <strong><em>Input Layer<\/em><\/strong> and the <strong><em>Field to calculate statistics on<\/em><\/strong>:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"836\" height=\"448\" src=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/basic_stats1.png\" alt=\"\" class=\"wp-image-1421\" style=\"width:582px;height:auto\" srcset=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/basic_stats1.png 836w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/basic_stats1-300x161.png 300w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/basic_stats1-768x412.png 768w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/basic_stats1-18x10.png 18w\" sizes=\"auto, (max-width: 836px) 100vw, 836px\" \/><\/figure>\n<\/div>\n\n\n<p>Then we click on <strong><em>Run<\/em><\/strong>. The output is a <strong><em>Log <\/em><\/strong>file in which we are only interested in the <strong><em>&#8216;SUM&#8217;: &#8230;<\/em><\/strong> entry. Write down this number, then do the same with the other layer. 
<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignright size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"27\" height=\"28\" src=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/field_calculator-1.png\" alt=\"\" class=\"wp-image-1435\" style=\"width:49px;height:auto\" srcset=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/field_calculator-1.png 27w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/field_calculator-1-12x12.png 12w\" sizes=\"auto, (max-width: 27px) 100vw, 27px\" \/><\/figure>\n<\/div>\n\n\n<p><strong>4. Calculate the Chi Value<\/strong><br>To calculate the Chi value, we open the attribute table of the layer with the &#8220;wald&#8221; names where both of the counts are stored (right click on the layer &#8211;&gt; <strong><em>Open Attribute Table<\/em><\/strong>). Click on <strong><em>Open Field Calculator <\/em><\/strong>(see icon on the right side of this paragraph). <\/p>\n\n\n\n<p>In the field calculator, we start by ticking the box <strong><em>Create a new field<\/em><\/strong>.  
We give it a name (<strong><em>Output field name<\/em><\/strong>) e.g., Chi, and choose as type (<strong><em>Output field type<\/em><\/strong>) <strong><em>1.2 Decimal number (real)<\/em><\/strong>.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"721\" src=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/calculate_chi-1024x721.png\" alt=\"\" class=\"wp-image-1424\" style=\"width:746px;height:auto\" srcset=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/calculate_chi-1024x721.png 1024w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/calculate_chi-300x211.png 300w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/calculate_chi-768x541.png 768w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/calculate_chi-18x12.png 18w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/calculate_chi.png 1123w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<p>In the middle part, click on <strong>Fields and Values<\/strong> to choose the column names from your attribute table. 
<\/p>\n\n\n\n<p>The formula for the <a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/4376138\" target=\"_blank\" aria-label=\"Chi value (opens in a new tab)\" rel=\"noreferrer noopener\" class=\"ek-link\">Chi value<\/a> is: <\/p>\n\n\n\n<p class=\"has-text-align-center\"><strong>Chi = (<mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-secondary-color\">observed frequency<\/mark> \u2013 <mark style=\"background-color:rgba(0, 0, 0, 0);color:#800df3\" class=\"has-inline-color\">expected frequency<\/mark> ) \/ \u221a <mark style=\"background-color:rgba(0, 0, 0, 0);color:#800df3\" class=\"has-inline-color\">expected frequency<\/mark><\/strong><\/p>\n\n\n\n<p>To normalize, we multiply the observed count by a factor built from the sums we just retrieved: the number of all the geographical names divided by the number of the &#8220;wald&#8221; names. In my case, the formula becomes: <\/p>\n\n\n\n<p><strong>(((258064 \/ 9127) * <mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-secondary-color\">Anz_Wald_Ortsnamen<\/mark>) &#8211; <mark style=\"background-color:rgba(0, 0, 0, 0);color:#800df3\" class=\"has-inline-color\">Anz_alle_Ortsnamen<\/mark>) \/<\/strong> \u221a <strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#800df3\" class=\"has-inline-color\">Anz_alle_Ortsnamen<\/mark><\/strong><\/p>\n\n\n\n<p>We enter this formula in the expression field on the left side. You can click the mathematical operators just below the field or type them, and insert the column names from <strong>Fields and Values<\/strong> in the middle. Note that the square root has to be entered as the function <strong><em>sqrt()<\/em><\/strong>, e.g., sqrt(Anz_alle_Ortsnamen). If the formula is written correctly, you should see a number below the field where it says <strong><em>Preview<\/em><\/strong>. Click on <strong><em>OK<\/em><\/strong>. In the attribute table you should now see the new column with the Chi values. Save the changes by clicking on the icon with the disk and the red pen. <\/p>\n\n\n\n<p><strong>5. 
Color the grid cells according to the Chi Value<\/strong><br>A negative Chi value means that names with &#8220;wald&#8221; are underrepresented in this grid cell compared to all the geographical names, while a positive value means an over-representation. We can highlight this difference with a diverging color scheme. We right click on the layer with the Chi value, then we choose <strong><em>Properties &#8211;&gt; Symbology<\/em><\/strong>. <\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"827\" height=\"998\" src=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/diverging_color_scheme_chi-2.png\" alt=\"\" class=\"wp-image-1427\" style=\"width:596px;height:auto\" srcset=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/diverging_color_scheme_chi-2.png 827w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/diverging_color_scheme_chi-2-249x300.png 249w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/diverging_color_scheme_chi-2-768x927.png 768w, https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/diverging_color_scheme_chi-2-10x12.png 10w\" sizes=\"auto, (max-width: 827px) 100vw, 827px\" \/><figcaption class=\"wp-element-caption\">Color the cells according to their Chi value. <\/figcaption><\/figure>\n<\/div>\n\n\n<p>Choose a <strong><em>Graduated <\/em><\/strong>color scheme.<\/p>\n\n\n\n<p><strong><em>Value<\/em><\/strong>: Choose the Chi value.<\/p>\n\n\n\n<p><strong><em>Color Ramp:<\/em><\/strong>&nbsp;By clicking on the down arrow, you get a list of possible color ramps or you can even create your own. Further, you can invert any ramp. Since the Chi values are negative and positive, a <a href=\"https:\/\/colorbrewer2.org\/#type=diverging&amp;scheme=BrBG&amp;n=5\" class=\"ek-link\">diverging color scheme<\/a> makes sense e.g., <strong><em>Spectral<\/em><\/strong>. 
If you want 0 to be in the middle of the color scheme, you can tick the box <strong><em>Symmetric Classification<\/em><\/strong> and add 0 in the field <strong><em>Around<\/em><\/strong>. You might also want to reduce the <strong><em>Opacity <\/em><\/strong>to around 40%. Click on <strong><em>Apply<\/em><\/strong>. If you are happy with the map, click on <strong><em>OK<\/em><\/strong>; otherwise, adapt the symbology.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-a89b3969 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/4_Chi_deutsch.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">German Version of the Tutorial (pdf)<\/a><\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>When we count the points of two data sets per cell, we can compare their densities. 
By calculating the Chi value, we can compare the actual density with the expected density.<\/p>","protected":false},"author":2,"featured_media":1432,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_editorskit_title_hidden":false,"_editorskit_reading_time":0,"_editorskit_is_block_options_detached":false,"_editorskit_block_options_position":"{}","footnotes":""},"categories":[37,38,244,253],"tags":[44,48],"class_list":["post-1408","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-geodata","category-gis-hub","category-qgis","category-tutorial","tag-geographic-visualisation","tag-ortsnamen","entry"],"featured_image_src":"https:\/\/www.gis-hub.uzh.ch\/wp-content\/uploads\/2024\/01\/Chi_map.png","author_info":{"display_name":"Katia Soland","author_link":"https:\/\/www.gis-hub.uzh.ch\/de\/author\/ksolan\/"},"_links":{"self":[{"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/posts\/1408","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/comments?post=1408"}],"version-history":[{"count":20,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/posts\/1408\/revisions"}],"predecessor-version":[{"id":2005,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/posts\/1408\/revisions\/2005"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/media\/1432"}],"wp:attachment":[{"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/media?parent=1408"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/categorie
s?post=1408"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gis-hub.uzh.ch\/de\/wp-json\/wp\/v2\/tags?post=1408"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}