Mapping hardwood forests through a two-stage unsupervised classification by integrating Landsat Thematic Mapper and forest inventory data

Gang Shao; Benjamin P. Pauli; G. Scott Haulton; Patrick A. Zollner; Guofan Shao

doi:10.1117/1.JRS.8.083546

15 October 2014

Journal of Applied Remote Sensing, Vol. 8, Issue 1, 083546 (October 2014). https://doi.org/10.1117/1.JRS.8.083546

Abstract

Sound forest management requires accurate forest maps at an appropriate scale. Forest cover data developed at a national scale may be too coarse for forest management at a local level. We demonstrated a two-stage unsupervised classification, integrating Continuous Forest Inventory (CFI) data and Landsat imagery, to classify forest types for Indiana State Forests (ISF) and their 8-km surrounding areas. In the first stage, an automatic unsupervised classification assisted by CFI data was applied in ISF. In the second stage, the resultant forest cover information from the first stage was used to expand the classification area into the 8-km surrounding areas. Splitting the classification procedure into two stages made it possible to expand the classification area beyond the coverage of the CFI data. This data-aided unsupervised classification approach increased the repeatability of forest mapping. The resultant map contains five forest types: conifer, conifer-hardwood, maple, mixed hardwood, and oak-hickory forests. The overall accuracy was 81.9%, and the total disagreement was 0.176. The producer’s accuracies of conifer, conifer-hardwood, maple, mixed hardwood, and oak-hickory forests were 81.6, 63.4, 75.0, 33.3, and 90.0%, respectively. This forest mapping technique is suitable for automated mapping of forest areas where extensive plot data are available.

1. Introduction

Remote sensing is an effective technology for extracting and mapping spatially explicit information on land use and land cover at different scales.¹,² Forest cover maps are one of the products made using remotely sensed data and are essential for forest management, habitat monitoring, and biodiversity analysis.³–⁶ General-purpose land cover maps developed at national and global scales usually describe forest information at a coarse level. For example, the National Land Cover Database (NLCD)⁷ for the United States contains three forest types: deciduous, evergreen, and mixed forests, a level-II classification defined by the United States Geological Survey (USGS).⁸ For the purposes of intensive forest management, habitat characterization, and forest health monitoring, it is essential to obtain more detailed forest information than the USGS (or Anderson) level-II classification can provide.⁹–¹¹

The major challenge in land use or forest classification is to increase classification detail with satisfactory accuracy.¹²–¹⁴ This is because forest classification can theoretically be made at any level, but classification accuracy decreases with increasing classification detail. Fine-level forest cover maps with low classification accuracy are no better than coarse-level forest cover maps with high classification accuracy, though different forest management activities may require different forest cover maps. A three-forest-class (or Anderson’s level-II) land cover map works well for national-scale forest assessment, but is insufficient for complex forest management at the state level, such as in Indiana where 95% of forests are deciduous.¹⁵ In this region, distinct deciduous hardwood forest types exist, requiring differing silvicultural regimes for management and maintenance.¹⁰,¹⁶ However, detailed classification of hardwood forests is difficult due to the similarities in spectral reflectance, canopy structure, and spatial mixture of hardwood tree species.¹⁷

There is a clear need for a quality forest cover map in Indiana to assess the habitat availability for bat species that are at risk from white-nose syndrome.¹⁸ For example, the Indiana bat (Myotis sodalis) is listed as an endangered species by the U.S. Fish and Wildlife Service as well as the International Union for Conservation of Nature.¹⁹ Populations of Indiana bats declined by more than 50% from 1960 to 2001²⁰ and are under considerable threat of further declines due to white-nose syndrome.²¹ During the summer, Indiana bats mainly use hardwood and hardwood-pine forests,²² and a quality forest cover map can assist in modeling current and future bat habitat.²³

There are several existing forest cover maps that contain more than three forest types at national or state scales. The Forest Cover Types map produced by the National Atlas of the United States²⁴ provides an alternative resource for obtaining a forest cover map in Indiana. This dataset was created from Advanced Very High Resolution Radiometer (AVHRR) and Landsat Thematic Mapper (TM) imagery acquired in 1991 and includes 25 types of forests at a 1-km spatial resolution. The hardwood forest classes, however, do not separate maple (Acer sp.) from mixed hardwood tree species groups even though maple is one of the most dominant tree genera in Indiana. Furthermore, no accuracy assessment information is associated with the metadata. The Indiana Gap Analysis Project has also produced a land cover map.²⁵ The development of this land cover map focused upon habitat attributes but did not consider subclasses of hardwood forests. The overall accuracy of this map was just 70.98%. However, for the assessment of land use and land cover mapping, the USGS recommended a minimum accuracy of 85%.⁸ For application in landscape quantification, a classification accuracy of >85% is necessary.¹³ Although there is no specific accuracy requirement for forest management and habitat delineation, our objective was to develop a forest cover map with 85% or better classification accuracy. Existing land cover maps were not satisfactory in terms of classification detail and accuracy; therefore, it was necessary to develop a more accurate and more detailed forest cover map to meet the needs of various spatial analyses, such as wildlife habitat simulations.

Image classification algorithms used to derive a forest cover map generally fall into supervised and unsupervised approaches.²⁶ Supervised classifications require a training dataset to predefine the statistical parameters (such as maximum likelihood) or nonparametric statistical learning functions (such as neural network and support vector machine).²⁷,²⁸ Unsupervised classifications (such as ISODATA and K-means clustering) generate spectral clusters based on the statistical information of the remote sensing imagery.²⁹,³⁰ The two approaches differ in their trade-offs: the former involves more human input than the latter, whereas the latter is more repeatable than the former. Strictly speaking, both approaches require human experience and neither is absolutely repeatable in practice. It is ideal for a classification algorithm to have minimal human input while maintaining a high level of accuracy and repeatability.³⁰–³³ Lang et al.³⁰ demonstrated a data-aided unsupervised classification (DUC) method that interfaced with sample data for labeling spectral classes into information classes. This automated classification approach is superior to the traditional supervised and unsupervised classification methods in terms of ease of use, classification accuracy, and repeatability. However, the use of DUC becomes impractical if there are insufficient ground sample data. Our access to thousands of geo-referenced forest inventory plots from Indiana State Forests (ISF) provides us with an excellent opportunity to classify forest types with the DUC method. This study is the first application of DUC and offers insight into the practical usefulness of this automated approach.

2. Study Area and Data Processing

This study focused on ISF and the 8-km surrounding areas (37°57′N to 40°54′N, 85°28′W to 87°38′W). Eight-kilometer surrounding areas were used because they encompass the majority of Indiana bat movement from roost sites to foraging areas.³⁴,³⁵ A total of 13 state forests with a total area of 9854 km² were included in the study area (Fig. 1). Located in the central hardwood forest region, the forests in the study area are dominated by hardwood species, such as oak (Quercus sp.), hickory (Carya sp.), maple, and tulip poplar (Liriodendron tulipifera).

Fig. 1

The extents of Indiana State Forests (ISF) and the 8-km surrounding areas across Indiana. The area between the lines is covered by Landsat TM path 21, constituting approximately 97% of the study area. The Landsat TM data are shown as the background with the RGB combination of bands 4, 3, and 2.

We downloaded cloud-free Landsat 5 TM data of paths 21 and 22, and rows 32, 33, and 34, acquired in April 2006, September 2008, and October 2010 from the USGS Earth Explorer data website (Table 1).³⁶ There were limited stand-replacing commercial harvests within the study area between 2006 and 2010. A comparison between NLCD 2006 and NLCD 2011 within the study area indicated that only 0.114% of forestland had changed on ISF and the 8-km surrounding areas. Therefore, the overall forest canopy structure remained generally intact during this period. These satellite image data capture spectral characteristics of tree species in spring and fall seasons. The use of different seasons of remotely sensed data has proven useful for improving classification accuracy for mapping tree species in temperate deciduous woodlands.³⁷ In the spring, the deciduous trees have varying levels of greenness due to differences in leaf-growing stages. In the fall, discrimination among deciduous species is possible due to differing colors and amounts of leaves. Therefore, the combination of the two-season datasets increases the ability to distinguish hardwood forest types in Indiana. Mosaics were made from images acquired on the same date and were all clipped with the boundary of our study area. Spring and fall TM image datasets were combined into a single dataset, resulting in two datasets covering 97 and 3% of the area by path 21 and path 22, respectively (Fig. 1). The data-aided classification method (discussed below) was applied to the classification of the path 21 image mosaic. The path 22 image mosaic was classified with a traditional unsupervised classification method due to insufficient plot data available for this small area.²⁶

Table 1

Landsat 5 Thematic Mapper data used in this study.

Year	Date	Path	Row	Cloud coverage
2006	April 28	21	32, 33, 34	0%
2008	September 24	21	32, 33, 34	0%
2010	October 16	21	32, 33, 34	0%
2010	October 7	22	33	0%

NLCD 2006 was used as a base map, which helped separate forest from nonforested areas (Fig. 2). This ensured the consistency between the newly developed land cover map and NLCD 2006 data for the nonforest categories. We clipped the image mosaics with forest polygons and derived image datasets that contained only forest pixels (Fig. 2).

Fig. 2

A flow chart of applying the two-stage, data-aided unsupervised classification method in forest cover mapping for ISFs and the 8-km surrounding areas.

Continuous Forest Inventory (CFI) plot data were collected on ISF by the Indiana Department of Natural Resources (IDNR) between 2006 and 2010 (Fig. 3).³⁸ The sampling intensity was one plot for approximately every 40 acres. Each plot is circular with a radius of 7.3 m. Stand type was recorded, and trees with a diameter at breast height of 5 in. and larger were measured on each plot. In this study, the plots were used as reference data (n = 2158) for image classification (Table 2). We grouped all the CFI plots into five forest types: conifer forest, conifer-hardwood forest, maple forest, mixed hardwood forest, and oak-hickory forest based upon the surveyors’ forest type classification. The conifer forest type included all the conifer-dominated plots. If the plot contained both hardwood and coniferous tree species, it was grouped into the conifer-hardwood forest type. Maple forest plots were those dominated by maple trees. Plots belonged to the oak-hickory forest type if they were dominated by oak and co-dominated by hickory. Mixed hardwood forest plots were those dominated by hardwood tree species other than maple, oak, or hickory. We collected data from an additional 321 plots for accuracy assessment³⁹,⁴⁰ based on the CFI technique above, and 248 of these additional plots were located within the ISF area. The 8-km surrounding areas were much greater than ISF in area, but contained only a small portion of the reference plots because private forestland was difficult to access.

Fig. 3

The spatial distribution of the Continuous Forest Inventory plots in ISF.

Table 2

The distribution of Continuous Forest Inventory plots among forest types in Indiana State Forest.

Forest type	Conifer	Conifer-hardwood	Maple	Mixed hardwood	Oak-hickory
Number of points	68	117	16	45	1912

3. Classification Methods

Band selection is required to improve efficiency and reduce redundancy. Different bands of Landsat 5 TM data have different features for forest mapping.⁴¹,⁴² Band 1 has the ability to distinguish deciduous from coniferous vegetation. Band 2 is useful for assessing plant vigor. Band 3 is sensitive to vegetation slopes. Band 4 is related to forest biomass content. Bands 5, 6, and 7 reflect the moisture content of soil and vegetation.⁴³ We performed band-by-band visual examinations to eliminate bands with obvious noise due to high scattered energy and excluded band 6 because of its large pixel size. Where several bands provided similar information for forest mapping, only one of them was kept to reduce redundancy.

The TM data analysis and classification were performed in Erdas Imagine 2010 (Ref. 44) and MATLAB® 2010a. The classification procedure included two stages (Fig. 2).

1. We classified ISF by applying the original DUC algorithm.³⁰ Unsupervised clustering was used to produce preliminary spectral clusters, followed by CFI-data-assisted cluster labeling. Of the 2158 CFI sample plots we used, 1912 were recorded as oak-hickory forest type (Table 2), which dominated the forestland in ISF. Maple, beech (Fagus sp.), basswood (Tilia sp.), and some other hardwood species are usually intermediate or overtopped in the forest stand.⁴⁵ We initially separated oak-hickory forest from the other forest types. CFI data were used to guide the recoding of oak-hickory forests based on the majority rule: if over half of the plots within a spectral cluster belonged to the oak-hickory type, the cluster was labeled as oak-hickory forest (Table 2). Due to the extremely uneven distribution of the CFI data among different forest types, a large number of spectral clusters were required to distinguish oak-hickory forest from the forest types with few CFI plots. This procedure resulted in a forest cover map with two forest types: oak-hickory and other forests. Then, the other forest types were split using the non-oak-hickory CFI plot data. For the classification of non-oak-hickory forest types, we used the relative-dominant rule: a spectral class was assigned to the forest type that had the most plots within the spectral cluster. If two hardwood forest types tied for the most plots and the cluster contained no coniferous plot, the spectral class was assigned to mixed hardwood forest. The classification at this stage resulted in a forest cover map within ISF.
2. Both ISF and the 8-km surrounding areas were considered at this stage, and a similar clustering procedure was applied. A large number of spectral clusters were required to identify the forest types with few CFI plots. However, the CFI data were insufficient to label large numbers of spectral clusters in the 8-km surrounding areas using the original DUC algorithm because most of the CFI plots were located within the ISF. The clusters that did not overlap with CFI data could not be recoded or labeled with CFI data (Fig. 3). Instead of using CFI plot data for recoding, the resultant forest cover map for ISF from the first classification stage was used in this procedure. Each pixel of this forest cover map was used to label the spectral clusters into forest cover classes by the majority rule of the data-assisted unsupervised classification as discussed in stage 1. Such clustering-recoding was repeated with an increasing number of spectral clusters until classification accuracy reached its maximum, resulting in a forest cover map for the entire study area.
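The two labeling rules described in stage 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Erdas Imagine/MATLAB implementation: the function names, the `(cluster_id, forest_type)` input format, the treatment of both conifer and conifer-hardwood plots as "coniferous", and the choice to leave other ties unresolved (`None`) are all assumptions, since the text does not specify them.

```python
from collections import Counter, defaultdict

OAK_HICKORY = "oak-hickory"
# Assumption: plots of either type count as "coniferous" for the tie rule.
CONIFER_TYPES = {"conifer", "conifer-hardwood"}


def label_oak_hickory(plots):
    """Majority rule: a cluster is oak-hickory if over half of its plots are.

    `plots` is an iterable of (cluster_id, forest_type) pairs (hypothetical format).
    """
    by_cluster = defaultdict(list)
    for cluster_id, forest_type in plots:
        by_cluster[cluster_id].append(forest_type)
    labels = {}
    for cluster_id, types in by_cluster.items():
        n_oak = sum(1 for t in types if t == OAK_HICKORY)
        labels[cluster_id] = OAK_HICKORY if n_oak > len(types) / 2 else "other"
    return labels


def label_other(plots):
    """Relative-dominant rule for clusters of the non-oak-hickory area."""
    by_cluster = defaultdict(Counter)
    for cluster_id, forest_type in plots:
        if forest_type != OAK_HICKORY:
            by_cluster[cluster_id][forest_type] += 1
    labels = {}
    for cluster_id, counts in by_cluster.items():
        top = max(counts.values())
        winners = [t for t, c in counts.items() if c == top]
        if len(winners) == 1:
            labels[cluster_id] = winners[0]
        elif not any(t in CONIFER_TYPES for t in counts):
            # Tie among hardwood types, no coniferous plot in the cluster.
            labels[cluster_id] = "mixed hardwood"
        else:
            # The text does not cover this case; left unresolved here.
            labels[cluster_id] = None
    return labels
```

In stage 2 the same majority rule would be applied with pixels of the stage 1 map taking the place of plots; that substitution only changes what is passed in as `plots`.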

Quantity, allocation, and total disagreements, which were developed by Pontius and Millones,⁴⁶ were calculated to assess the classification results statistically in each stage. Quantity disagreement is defined as the amount of difference in the proportions of the categories between the resultant map and the reference data. Allocation disagreement is the difference in spatial allocation of the categories between the resultant map and the reference data. Total disagreement is the sum of quantity and allocation disagreements.⁴⁶ McNemar’s test was used to test whether the error matrices from stages 1 and 2 differed significantly.⁴⁷,⁴⁸ The Z score was calculated with the following equation:

$$Z = \frac{|a - b|}{\sqrt{a + b}}, \tag{1}$$

where $a$ was the number of samples that were misclassified by the first stage classification but were correctly classified by the second stage classification, and $b$ was the number of samples that were correctly classified by the first stage classification but were misclassified by the second stage classification. We considered the difference between the results from the two stages statistically significant ($p < 0.05$) if the $Z$ value was $> 1.96$.
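As a small worked sketch of Eq. (1), the function below computes the McNemar Z score from the two discordant counts. The function name and the example counts are hypothetical; the paper reports only the resulting Z value, not the values of a and b.

```python
import math


def mcnemar_z(a, b):
    """Z score of McNemar's test from the two discordant counts.

    a: samples wrong in stage 1 but right in stage 2
    b: samples right in stage 1 but wrong in stage 2
    """
    if a + b == 0:
        raise ValueError("no discordant samples; Z is undefined")
    return abs(a - b) / math.sqrt(a + b)


# Hypothetical counts, chosen only to illustrate the arithmetic.
z = mcnemar_z(20, 5)   # |20 - 5| / sqrt(25) = 3.0
significant = z > 1.96  # two-sided test at p < 0.05
```

Samples that both stages classify correctly, or both misclassify, do not enter the statistic; only the discordant pairs matter.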

A complete land cover dataset was then produced by overlaying this forest cover map with the water layer derived with the original TM imagery and the existing NLCD 2006. The newly derived forest cover dataset is referred to as a five-forest-class (FFC) map, containing conifer forest, conifer-hardwood forest, maple forest, mixed hardwood forest, and oak-hickory forest.

4. Results

In the band selection process, we selected nine bands from the Landsat 5 TM data for forest cover mapping, including bands 2, 3, 4, and 5 from the 2006 TM data, bands 4 and 5 from the 2008 TM data, and bands 4, 5, and 7 from the 2010 TM data. Band 1 was excluded from all Landsat datasets because of its high scattered energy. Band 6 was eliminated from all datasets due to its large pixel size and because the information it provided was not critical for forest mapping. One out of three of bands 2, 3, and 7 were kept to reduce the redundancy. Bands 4 and 5 were included from all datasets because they are sensitive to the seasonal changes in the spectral reflectance of the forest species.

To separate oak-hickory forest from other forest types in stage 1, we started with 100 spectral clusters using unsupervised classification and increased the number of clusters in 50-cluster increments.²⁶ The highest classification accuracy was reached with 200 spectral clusters. For the classification of non-oak-hickory forest types, the classification accuracy was highest with 100 spectral clusters. In this step, using too many spectral clusters would leave one or more clusters with no overlapping CFI plot. The FFC map within ISF was created at this stage [Fig. 4(a)]. In the second stage, 200 spectral clusters were classified to create an FFC map within ISF and the 8-km surrounding areas [Fig. 4(b)]. The accuracy was expressed as the spatial agreement of forest types within the shared ISF area between the forest cover maps from the two stages.
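The search over the number of spectral clusters described above can be sketched as a simple loop. The sketch assumes a caller-supplied `score` function (hypothetical) that runs clustering and labeling for a given cluster count and returns the resulting accuracy; the upper bound of 400 is also an assumption, since the text gives only the starting value of 100 and the 50-cluster increment.

```python
def best_cluster_count(score, start=100, stop=400, step=50):
    """Try cluster counts start, start+step, ... up to stop; keep the best.

    `score(k)` is a hypothetical callback returning the classification
    accuracy obtained with k spectral clusters.
    """
    best_k, best_acc = None, float("-inf")
    for k in range(start, stop + 1, step):
        acc = score(k)
        if acc > best_acc:  # strict comparison keeps the smaller k on ties
            best_k, best_acc = k, acc
    return best_k, best_acc
```

Preferring the smaller count on ties is a design choice made here, in line with the observation that too many clusters leave some of them without any overlapping plot.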

Fig. 4

The ISF forest cover map (a), the forest cover map of the ISF and the 8-km surrounding areas (b), and the National Land Cover Database (NLCD) 2006 (c).

The FFC map resulting from the first stage of the classifications reached an overall accuracy of 74.2% (Table 3). The quantity and allocation disagreements were 0.052 and 0.206, respectively. The total disagreement was 0.258.⁴⁶ The oak-hickory forest type had the highest accuracy on the average of user’s and producer’s accuracies, followed by conifer, conifer-hardwood, maple, and mixed hardwood forests. The classification accuracies of conifer-hardwood, maple, and mixed hardwood forests were lower than desirable. The second stage of the classifications increased the overall accuracy to 81.9% (Table 4), better than the 78% overall accuracy of the NLCD 2006 level-II classification [Fig. 4(c)].⁴⁹ The total disagreement was 0.176 with a quantity disagreement of 0.028 and an allocation disagreement of 0.148. Again, the oak-hickory forest type had the highest accuracy; conifer and maple forest types also had satisfactory accuracies. The mixed hardwood forest type had a rather low producer’s accuracy though its user’s accuracy was relatively high. McNemar’s test was applied to compare the results from the first and second stages of the classification. There were 73 plot samples that were used in the second stage of the classification but not in the first stage because they were located in the 8-km surrounding areas of ISF. As McNemar’s test requires the same set of samples in the two matrices, these 73 samples, 7 of which were misclassified in the second stage of the classification, were not used in the test. The $Z$ value of the McNemar’s test was 2.98, which was $> 1.96$. Therefore, the difference between the results from the two classification stages was statistically significant at the 95% confidence level, which means that the result in the second stage of classification was significantly improved.
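For readers unfamiliar with the quantity and allocation disagreement measures of Pontius and Millones,⁴⁶ the following sketch computes them from a square error matrix. It assumes the raw sample counts can be used directly as proportions; if the matrix is first converted to estimated population proportions, the results will differ, so this sketch is not expected to reproduce the values reported above. The function name and the 2 × 2 example matrix are hypothetical.

```python
def disagreement(matrix):
    """Return (quantity, allocation, total) disagreement for a square matrix.

    Rows are map categories, columns are reference categories.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[cell / total for cell in row] for row in matrix]
    row_sum = [sum(p[g]) for g in range(n)]
    col_sum = [sum(p[i][g] for i in range(n)) for g in range(n)]
    # Quantity: mismatch in category proportions, halved to avoid double counting.
    quantity = sum(abs(row_sum[g] - col_sum[g]) for g in range(n)) / 2
    # Allocation: for each category, twice the smaller of commission and omission.
    allocation = sum(
        2 * min(row_sum[g] - p[g][g], col_sum[g] - p[g][g]) for g in range(n)
    ) / 2
    return quantity, allocation, quantity + allocation


# Hypothetical two-class matrix: quantity 0.1, allocation 0.2, total 0.3,
# and the total equals 1 minus the proportion correct (0.7).
q, a, d = disagreement([[40, 10], [20, 30]])
```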

Table 3

An error matrix of classification for forests within Indiana State Forest following the stage one classification.

Classification (rows) / Reference (columns)	Conifer	Conifer-hardwood	Maple	Mixed hardwood	Oak-hickory	Total	UA (%)
Conifer	39	6	0	1	1	47	83.0
Conifer-hardwood	1	19	0	2	7	29	65.5
Maple	1	0	14	0	11	26	53.9
Mixed hardwood	0	2	0	6	2	10	60.0
Oak-hickory	8	14	2	6	106	136	82.8
Total	49	41	16	15	127	248
PA (%)	79.6	46.3	87.5	40.0	83.5		OA = 74.2

Note: UA, user’s accuracy; PA, producer’s accuracy; OA, overall accuracy.

Table 4

An error matrix of classification for forests in Indiana State Forest and the 8-km surrounding areas following the stage two classification.

Classification (rows) / Reference (columns)	Conifer	Conifer-hardwood	Maple	Mixed hardwood	Oak-hickory	Total	UA (%)
Conifer	40	4	0	1	2	47	85.1
Conifer-hardwood	1	26	0	5	15	47	55.3
Maple	1	0	12	0	2	15	80.0
Mixed hardwood	0	0	0	5	1	6	83.3
Oak-hickory	7	11	4	4	180	206	87.3
Total	49	41	16	15	200	321
PA (%)	81.6	63.4	75.0	33.3	90.0		OA = 81.9

Note: UA, user’s accuracy; PA, producer’s accuracy; OA, overall accuracy.
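As a worked check of how the accuracy figures are derived from an error matrix, the sketch below recomputes the overall, user's, and producer's accuracies from the cell counts of Table 4. The function and variable names are hypothetical; the counts are copied from the table.

```python
LABELS = ["conifer", "conifer-hardwood", "maple", "mixed hardwood", "oak-hickory"]

# Rows: classification; columns: reference (cell counts from Table 4).
TABLE_4 = [
    [40, 4, 0, 1, 2],
    [1, 26, 0, 5, 15],
    [1, 0, 12, 0, 2],
    [0, 0, 0, 5, 1],
    [7, 11, 4, 4, 180],
]


def accuracy_metrics(matrix):
    """Return (OA, UA list, PA list) in percent for a square error matrix.

    Assumes every row and column total is nonzero, as in Table 4.
    """
    n = len(matrix)
    diag = [matrix[g][g] for g in range(n)]
    total = sum(sum(row) for row in matrix)
    ua = [100 * diag[g] / sum(matrix[g]) for g in range(n)]  # by row total
    pa = [100 * diag[g] / sum(matrix[i][g] for i in range(n)) for g in range(n)]
    oa = 100 * sum(diag) / total
    return oa, ua, pa


oa, ua, pa = accuracy_metrics(TABLE_4)
for name, u, p in zip(LABELS, ua, pa):
    print(f"{name}: UA {u:.1f}%, PA {p:.1f}%")
print(f"OA {oa:.1f}%")
```

User's accuracy divides each diagonal cell by its row total (how often a mapped class is correct), whereas producer's accuracy divides by the column total (how often a reference class is found by the map).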

The FFC map shows that oak-hickory and mixed hardwood forests dominate the forest landscape, comprising 81% of the forested landscape in areas within ISF [Fig. 5(a)]. This result was consistent with forest composition data reported by IDNR (2008),¹⁰ which were obtained from field observations. The maple forest type in the FFC map constitutes 7% of the ISF area, equivalent to the sum of 4% for maple and 3% for bottomland hardwoods (IDNR 2008) [Fig. 5(b)]. The combined area of conifer and conifer-hardwood mixed forests in the FFC map was slightly more than the pine forest area reported by IDNR (2008). The proportion of conifer-hardwood forest in the FFC map was reasonable when compared with independent forest inventory and analysis (FIA) data from the USDA Forest Service.⁵⁰ The conifer-hardwood forest covered a much larger area in the FFC map than in the NLCD 2006 data both inside and outside ISF, and the FFC map is much more detailed than the forest cover classification in NLCD 2006 [Fig. 5(c)].

Fig. 5

A comparison of forest composition in area among NLCD 2006 (a), five-forest-class map (b), and report by IDNR (2008) (c) for ISF.

5. Discussion

This study is the first demonstration of the DUC algorithm³⁰ for a real-world classification exercise with an integration of remotely sensed and plot-based forest data. Lang et al.³⁰ designed and tested the DUC algorithm to classify several sample sites into four general land covers: agriculture, forest, urban, and water. In this study, we applied this algorithm to a much larger site, the ISF with the 8-km surrounding areas. We split the forests into more detailed forest types, which was more difficult than the general land cover classification because of the spectral similarities among hardwood forest types. The classification accuracy of such an unsupervised classification approach is determined mainly by imagery quality and sample data. The two-season Landsat TM datasets we used in this study provide richer spectral information than single-season imagery for the classification of hardwood forests. The CFI plots used in this study were abundant though their distribution among forest types was uneven. All these factors contributed to the development of an FFC map that is reasonably consistent with field survey data. This shows that the integration of image data and forest field data has made the resultant forest cover map more realistic and objective than the use of image data alone.

The overall accuracy of the forest cover map derived at the second stage was unexpectedly greater than that at the first stage. A possible reason is that spectral classes derived from a larger forest area (ISF and the 8-km surrounding areas) represent forest types better than those from a smaller forest area (ISF). Because most sample points are located inside ISF, forest cover information within ISF may be more reliable than that outside ISF. The producer’s accuracy of the maple forest type in the second stage was lower than that in the first stage; however, the user’s accuracy increased markedly. This means the FFC map created in the first stage overestimated maple. The FFC map of the second stage showed a lower quantity disagreement. It is also reasonable that the additional misclassified maple plots fell into the oak-hickory forest type, because maples are shade-tolerant species and are usually suppressed under oak-hickory forests. The same trend occurred in mixed hardwood forest, where the producer’s accuracy decreased and the user’s accuracy increased. However, there was no conclusive evidence that mixed hardwood was overestimated in the first stage. The FFC map in the second stage might have underestimated mixed hardwood because the total number of mixed hardwood samples in the resultant map was much smaller than that in the reference data. The decrease in allocation disagreement suggests that the FFC map in the second stage is likely of better quality.

The classification accuracy is also affected, in theory, by spatial mismatch between pixel coordinates and plot locations. The coordinates of the CFI plots used in this study were measured with handheld global positioning system (GPS) receivers and their errors are normally up to 10 m.⁵¹ The combination of the GPS error and TM image geometric error can be problematic for heterogeneous forest canopies. The coverage of this type of forest inventory data is available for limited forest areas, geographically restricting the applications of the data-assisted unsupervised classifications.

Given the first experiment with the classification method, we suggest that it should be broadly applied with the following considerations:

1. Temporal consistency: The time of image data to be used for classification should be close to the time of forest data collection. If there is a difference in time, areas that have changed between the two times should be excluded from image data classification.
2. Spatial correction: Both image data and plot data should be correctly geo-referenced. Geometric corrections may be needed though most remotely sensed data have been geo-referenced by the data provider. It is ideal that forest plot coordinates measured with GPS are differentially corrected. Because the data-aided unsupervised classification is a nonparametric approach, sample plots for classification can be purposely located in the middle of forest stands on the ground. Alternatively, plots located on the edges of forest stands can be excluded from their use for assisting image data classification.
3. Spatial representation: Sample plots should have an extensive coverage over the study area so that only the first stage is needed for completing image data classification. Plot size needs to be big enough to have a good representation of tree composition from which forest types are derived. If the plot data are used exclusively for image data classification, the simplest attribute measurement is to record the name of the forest type by experienced forest surveyors. Only canopy tree species are helpful to image data classification.
4. Map validation: The overall accuracy is the first but not the only consideration for the quality of a forest cover map. It is essential that a forest cover map be assessed with plot data that are not used for assisting image data classification.³⁹ The sample plots that are used for accuracy assessment should be randomly located on the ground. It is best to have a broad coverage over the study area with reference sample plots. It is important to balance the user’s and producer’s accuracies.¹³,⁵²

An important feature of the DUC algorithm is its repeatability, meaning that the classification procedure can be systematically modified by simply changing classification parameters and labeling rules if classification results are not satisfactory; hence the loops in the flow chart of the two-stage unsupervised classification (Fig. 2). Our experience indicates that an automated computer program for recoding can make the classification less labor intensive and reduce human errors in labeling processes even with hundreds of spectral classes. Analysts only need to have fundamental remote sensing training to implement this classification technique.

Forest inventory data, such as FIA, have been used in various ways to improve forest assessment together with remotely sensed imagery.⁴⁰,⁵³,⁵⁴ If researchers can access the exact FIA plot locations, the FIA-DUC approach can be broadly applied at the regional and national scales in the United States. Such a forest mapping procedure will help save time and money due to a reduction in the necessary ground validation.

6. Conclusions

This study demonstrates the first application of the DUC algorithm in dividing hardwood forests into three forest types in an objective fashion. The overall quality of the resultant forest cover map suggests that the DUC approach with forest inventory data is an effective and efficient method for mapping forest cover in Indiana. However, current ground data do not allow us to classify the hardwood forest into even more detailed levels with satisfactory accuracy. A stepwise classification procedure of species composition overcomes the difficulties caused by the extremely uneven distribution of ground data. The two-stage DUC algorithm successfully extends the mapping area to 8 km away from the plot-based forest data without jeopardizing classification accuracy. This forest mapping technique is suitable for mapping other forest areas where extensive plot data are available.

A forest cover map needs to be noise free if it is to be used for forest management activities in the field. In other words, to reduce salt-and-pepper noise pixels, the minimum mapping unit, chosen at a scale appropriate to the objective of the map, will likely be greater than the pixel size of the remote sensing imagery. A forest cover map derived from pixel-based classifications can be filtered to remove noise pixels by using image processing algorithms, such as connected component analysis and morphological processing. It is also possible to integrate spectral classes with image segmentation to obtain patch-style spectral classes, on which the labeling procedure is then implemented. Such comprehensive experiments need systematic exploration in the future.
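One way the connected component filtering mentioned above might look is sketched below. This is an assumption-laden illustration rather than a method used in the study: the function name `sieve`, the 4-neighbor connectivity, the patch-size threshold, and the rule of filling a small patch with the label it touches most often are all choices made here for the example.

```python
from collections import Counter, deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)


def sieve(grid, min_size):
    """Relabel connected patches smaller than min_size pixels.

    `grid` is a list of rows of class labels. Each small patch is filled
    with the label that shares the most pixel edges with it.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            label = grid[r0][c0]
            seen[r0][c0] = True
            patch, border = [], Counter()
            queue = deque([(r0, c0)])
            while queue:  # breadth-first flood fill of one patch
                r, c = queue.popleft()
                patch.append((r, c))
                for dr, dc in NEIGHBORS:
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    if grid[rr][cc] == label:
                        if not seen[rr][cc]:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
                    else:
                        border[grid[rr][cc]] += 1
            if len(patch) < min_size and border:
                fill = border.most_common(1)[0][0]
                for r, c in patch:
                    out[r][c] = fill
    return out
```

Patches are identified on the original grid and written to a copy, so the result does not depend on the order in which small patches are visited. Here `min_size` plays the role of the minimum mapping unit expressed in pixels.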

References

1. S. E. Franklin and M. A. Wulder, “Remote sensing methods in medium spatial resolution satellite data land cover classification of large areas,” Prog. Phys. Geogr., 26 (2), 173–205 (2002). http://dx.doi.org/10.1191/0309133302pp332ra

2. S. D. Jawak and A. J. Luis, “Improved land cover mapping using high resolution multiangle 8-band WorldView-2 satellite remote sensing data,” J. Appl. Remote Sens., 7 (1), 073573 (2013). http://dx.doi.org/10.1117/1.JRS.7.073573

3. W. B. Cohen et al., “Modelling forest cover attributes as continuous variables in a regional context with Thematic Mapper data,” Int. J. Remote Sens., 22 (12), 2279–2310 (2001). http://dx.doi.org/10.1080/01431160121472

4. J. L. Ohmann and M. J. Gregory, “Predictive mapping of forest composition and structure with direct gradient analysis and nearest-neighbor imputation in coastal Oregon, USA,” Can. J. For. Res., 32 (4), 725–741 (2002). http://dx.doi.org/10.1139/x02-011

5. K. M. Bergen et al., “Remote sensing of vegetation 3-D structure for biodiversity and habitat: review and implications for lidar and radar spaceborne missions,” J. Geophys. Res.: Biogeosci., 114 (G2), 2005–2012 (2009). http://dx.doi.org/10.1029/2008jg000883

6. E. H. Helmer et al., “Detailed maps of tropical forest types are within reach: forest tree communities for Trinidad and Tobago mapped with multiseason Landsat and multiseason fine-resolution imagery,” For. Ecol. Manage., 279 (1), 147–166 (2012). http://dx.doi.org/10.1016/j.foreco.2012.05.016

7. J. A. Fry et al., “Completion of the 2006 National Land Cover Database for the Conterminous United States,” Photogramm. Eng. Remote Sens., 77 (9), 859–864 (2011).

8. J. R. Anderson, E. E. Hardy, and J. T. Roach, “A land use and land cover classification system for use with remote sensor data,” (1976).

9. D. R. Breininger et al., “Landcover characterizations and Florida scrub-jay (Aphelocoma coerulescens) population dynamics,” Biol. Conserv., 128 (2), 169–181 (2006). http://dx.doi.org/10.1016/j.biocon.2005.09.026

10. Indiana Department of Natural Resources, Division of Forestry, “Increased emphasis on management and sustainability of oak-hickory communities on the Indiana State forest system, 2008–2027,” (2008). http://www.in.gov/dnr/forestry/files/fo-StateForests_EA.pdf

11. R. K. Swihart et al., “The hardwood ecosystem experiment: a framework for studying responses to forest management,” (2013). http://www.nrs.fs.fed.us/pubs/42882

12. G. M. Foody, “Status of land cover classification accuracy assessment,” Remote Sens. Environ., 80 (1), 185–201 (2002). http://dx.doi.org/10.1016/S0034-4257(01)00295-4

13. G. F. Shao and J. G. Wu, “On the accuracy of landscape pattern analysis using remote sensing data,” Landsc. Ecol., 23 (1), 505–511 (2008). http://dx.doi.org/10.1007/s10980-008-9215-x

14. S. Adelabu et al., “Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image,” J. Appl. Remote Sens., 7 (1), 073480 (2013). http://dx.doi.org/10.1117/1.JRS.7.073480

15. J. Gallion, “Indiana’s forest resource in 2011,” (2011). http://www.inwoodlands.org/indianas_forest_resource_in_20/

16. J. Gallion and C. Woodall, “The sustainability of Indiana’s forest resources,” (2010). http://www.in.gov/dnr/forestry/files/fo-SIFR(lowres).pdf

17. R. Pu, “Broadleaf species recognition with in situ hyperspectral data,” Int. J. Remote Sens., 30 (11), 2759–2779 (2009). http://dx.doi.org/10.1080/01431160802555820

18. W. E. Thogmartin et al., “Population-level impact of white-nose syndrome on the endangered Indiana bat,” J. Mammal., 93 (4), 1086–1098 (2012). http://dx.doi.org/10.1644/11-MAMM-A-355.1

19. Myotis sodalis, The IUCN Red List of Threatened Species, http://www.iucnredlist.org/details/14136/0

20. R. L. Clawson, “Trends in population size and current status,” The Indiana Bat: Biology and Management of an Endangered Species, 2–8, Bat Conservation International, Austin, TX (2002).

21. W. E. Thogmartin et al., “White-nose syndrome is likely to extirpate the endangered Indiana bat over large parts of its range,” Biol. Conserv., 160 (April), 162–172 (2013). http://dx.doi.org/10.1016/j.biocon.2013.01.010

22. E. V. Callahan, R. D. Drobney, and R. L. Clawson, “Selection of summer roosting sites by Indiana bat (Myotis sodalis) in Missouri,” J. Mammal., 78 (3), 818–825 (1997). http://dx.doi.org/10.2307/1382939

23. B. P. Pauli, “Nocturnal and diurnal habitat of Indiana and Northern long-eared bats, and the simulated effect of timber harvest on habitat suitability,” Purdue University, West Lafayette, Indiana (2014).

24. Forest Cover Types Info, http://www.nationalatlas.gov/mld/foresti.html

25. Gap Analysis Bulletin No. 16, March 2009, http://www.gap.uidaho.edu/bulletins/16/Indiana.pdf

26. J. R. Jensen, Introductory Digital Image Processing: A Remote Sensing Perspective, 3rd ed., Prentice Hall, NY (2005).

27. Y. Tarabalka, J. A. Benediktsson, and J. Chanussot, “Spectral-spatial classification of hyperspectral imagery based on partitional clustering techniques,” IEEE Trans. Geosci. Remote Sens., 47 (8), 2973–2987 (2009). http://dx.doi.org/10.1109/TGRS.2009.2016214

28. G. Mountrakis, J. Im, and C. Ogole, “Support vector machines in remote sensing: a review,” ISPRS J. Photogramm. Remote Sens., 66 (3), 247–259 (2011). http://dx.doi.org/10.1016/j.isprsjprs.2010.11.001

29. D. S. Lu and Q. H. Weng, “A survey of image classification methods and techniques for improving classification performance,” Int. J. Remote Sens., 28 (5), 823–870 (2007). http://dx.doi.org/10.1080/01431160600746456

30. R. L. Lang et al., “Optimizing unsupervised classifications of remotely sensed imagery with a data-aided labeling approach,” Comput. Geosci., 34 (12), 1877–1885 (2008). http://dx.doi.org/10.1016/j.cageo.2007.10.011

31. J. Franklin et al., “Rationale and conceptual framework for classification approaches to assess forest resources and properties,” Remote Sensing of Forest Environments, 279–300 (2003).

32. M. Song, D. L. Civco, and J. D. Hurd, “A competitive pixel-object approach for land cover classification,” Int. J. Remote Sens., 26 (22), 4981–4997 (2005). http://dx.doi.org/10.1080/01431160500213912

33. B. Schmook et al., “A step-wise land-cover classification of the tropical forests of the Southern Yucatán, Mexico,” Int. J. Remote Sens., 32 (4), 1139–1164 (2011). http://dx.doi.org/10.1080/01431160903527413

34. S. W. Murray and A. Kurta, “Nocturnal activity of the endangered Indiana bat (Myotis sodalis),” J. Zool., 262 (2), 197–206 (2004). http://dx.doi.org/10.1017/S0952836903004503

35. D. W. Sparks et al., “Foraging habitat of the Indiana bat (Myotis sodalis) at an urban–rural interface,” J. Mammal., 86 (4), 713–718 (2005). http://dx.doi.org/10.1644/1545-1542(2005)086[0713:FHOTIB]2.0.CO;2

36. EarthExplorer, http://earthexplorer.usgs.gov/

37. R. A. Hill et al., “Mapping tree species in temperate deciduous woodland using time-series multi-spectral data,” Appl. Veg. Sci., 13 (1), 86–99 (2010). http://dx.doi.org/10.1111/avsc.2010.13.issue-1

38. J. Gallion, “Report of Continuous Forest Inventory (CFI): Summary of year 1–5 (2008–2012),” http://www.in.gov/dnr/forestry/files/fo-CFI_Report_2008-12.pdf

39. R. G. Congalton, “A review of assessing the accuracy of classifications of remotely sensed data,” Remote Sens. Environ., 37 (1), 35–46 (1991). http://dx.doi.org/10.1016/0034-4257(91)90048-B

40. L. Leefers and N. Subedi, “Forest type classification accuracy assessment for Michigan’s State and National Forests,” North. J. Appl. For., 29 (1), 35–42 (2012). http://dx.doi.org/10.5849/njaf.09-024

41. D. S. Lu et al., “Relationships between forest stand parameters and Landsat TM spectral responses in the Brazilian Amazon Basin,” For. Ecol. Manage., 198 (1), 149–167 (2004). http://dx.doi.org/10.1016/j.foreco.2004.03.048

42. G. F. Shao et al., “Forest cover types derived from Landsat Thematic Mapper imagery for Changbai Mountain area of China,” Can. J. For. Res., 26 (2), 206–216 (1996). http://dx.doi.org/10.1139/x26-024

43. FAQ about the Landsat Missions, http://landsat.usgs.gov/best_spectral_bands_to_use.php

44. Erdas Imagine, http://www.hexagongeospatial.com/products/remote-sensing/erdas-imagine

45. P. S. Johnson et al., The Ecology and Silviculture of Oaks, (2009).

46. R. G. Pontius and M. Millones, “Death to kappa: birth of quantity disagreement and allocation disagreement for accuracy assessment,” Int. J. Remote Sens., 32 (15), 4407–4429 (2011). http://dx.doi.org/10.1080/01431161.2011.552923

47. Q. McNemar, “Note on the sampling error of the difference between correlated proportions or percentages,” Psychometrika, 12 (2), 153–157 (1947). http://dx.doi.org/10.1007/BF02295996

48. A. H. Bowker, “A test for symmetry in contingency tables,” J. Am. Stat. Assoc., 43 (244), 572–574 (1948). http://dx.doi.org/10.1080/01621459.1948.10483284

49. J. D. Wickham et al., “Accuracy assessment of NLCD 2006 land cover and impervious surface,” Remote Sens. Environ., 130 (15), 294–304 (2013). http://dx.doi.org/10.1016/j.rse.2012.12.001

50. What types of tree grow in Indiana, http://na.fs.fed.us/spfo/pubs/misc/in98forests/webversion/whatypes.htm

51. M. G. Wing, A. Eklund, and L. D. Kellogg, “Consumer-grade global positioning system (GPS) accuracy and reliability,” J. For., 103 (4), 169–173 (2005). http://dx.doi.org/10.5849/sjaf.13-006

52. G. F. Shao et al., “An explicit index for assessing the accuracy of cover class areas,” Photogramm. Eng. Remote Sens., 69 (8), 907–913 (2003). http://dx.doi.org/10.14358/PERS.69.8.907

53. M. D. Nelson et al., “Combining satellite imagery with forest inventory data to assess damage severity following a major blow down event in northern Minnesota, USA,” Int. J. Remote Sens., 30 (19), 5089–5108 (2009). http://dx.doi.org/10.1080/01431160903022951

54. Y. Zhang et al., “Integration of satellite imagery and forest inventory in mapping dominant and associated species at a regional scale,” Environ. Manage., 44 (2), 312–323 (2009). http://dx.doi.org/10.1007/s00267-009-9307-7

Biography

Gang Shao received his BS in biomedical information engineering from Northeastern University in China and his MS in forestry and natural resources from Purdue University, USA. He is currently a PhD student in forestry and natural resources at Purdue University, USA.

Benjamin P. Pauli received his BA in biology from Lawrence University, USA, and received his MS and PhD in quantitative ecology from Purdue University, USA. He is currently a postdoctoral researcher at Boise State University, USA.

G. Scott Haulton received a BS in English from the State University of New York College at Geneseo, USA, a BS in environmental and forest biology from the State University of New York College of Environmental Science and Forestry, USA, and an MS in wildlife science from Virginia Tech, USA. He is currently a forestry wildlife specialist with the Indiana Department of Natural Resources, Division of Forestry.

Patrick A. Zollner received his BS in natural resources from University of Michigan, USA, his MS in wildlife ecology from Mississippi State University, USA, and his PhD in ecology from Indiana State University, USA. He is currently an associate professor of quantitative ecology in the Department of Forestry and Natural Resources, Purdue University, USA.

Guofan Shao received the PhD degree in ecology from the Chinese Academy of Sciences, Shenyang, China, and received postdoctoral education from the Department of Environmental Sciences, University of Virginia, USA. He is currently a professor with the Department of Forestry and Natural Resources, Purdue University, USA. His main research interests focus on remote-sensing data analysis and error propagation, landscape quantification, urbanization processes, and geospatial modeling.

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Citation

Gang Shao, Benjamin P. Pauli, G. Scott Haulton, Patrick A. Zollner, and Guofan Shao "Mapping hardwood forests through a two-stage unsupervised classification by integrating Landsat Thematic Mapper and forest inventory data," Journal of Applied Remote Sensing 8(1), 083546 (15 October 2014). https://doi.org/10.1117/1.JRS.8.083546

Published: 15 October 2014


