



ICFP 2019, IOP Publishing
IOP Conf. Series: Earth and Environmental Science 415 (2020) 012029, doi:10.1088/1755-1315/415/1/012029

Evaluation of a model using local features and a codebook for wood identification

S W Hwang1, K Kobayashi2 and J Sugiyama1,3

1 Research Institute for Sustainable Humanosphere, Kyoto University, Kyoto 611-0011
2 Graduate School of Agricultural and Life Sciences, Department of Biomaterial Science, The University of Tokyo, Tokyo 133-8657, Japan
3 College of Materials Science and Engineering, Nanjing Forestry University, Nanjing 210037, China

E-mail: sungwook hwang (at) rish.kyoto-u.ac.jp

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by IOP Publishing Ltd.

Abstract. We designed a model for wood identification based on scale-invariant feature transform (SIFT) descriptors and a codebook. A dataset consisting of cross-sectional optical micrographs of the Lauraceae family, including 39 species, was used for identification. The bag-of-features (BOF) model was superior to the model that combined SIFT descriptors with a classifier. Among the four classifiers applied to both models, the support vector machine (SVM) achieved the best identification performance, with 99.4% accuracy. From the feature importance calculated by the random forests and the inverse document frequency (IDF) score, it was also confirmed that cell corner-based features are more informative for the identification of Lauraceae. In particular, cell corners in vessels are not only important for species identification but were also revealed to be species-specific features. The computer vision-based model was suitable for Lauraceae identification and enabled the quantification of anatomical structures, which is not possible with conventional visual inspection for wood identification.

1. Introduction

Studies related to wood, such as botany, cultural property science and archeology, as well as wood science, start with the accurate identification of species. This is because wood not only presents a variety of characteristics by species but also contains species-specific information. The conventional wood identification process is performed by visually inspecting the morphological structure of the three orthogonal planes. This standard visual inspection is the most reliable method to date.

The vast variety of trees distributed throughout the world, or a country, and even in a city, have different anatomical characteristics by species. The International Association of Wood Anatomists (IAWA) list of microscopic features for hardwood identification is an international standard for wood identification that consists of 221 feature codes [1]. There are 163 feature codes to identify Japanese hardwoods alone [2]. Therefore, long-term training and experience are required to identify wood by visual inspection. Nevertheless, there are certain limitations to this method. Wood identification by visual inspection is generally possible only to the genus level, and species that have identical anatomical characteristics according to the feature codes cannot be distinguished. There are many species with the same anatomical characteristics among trees classified as different species in plant taxa. Such difficult-to-identify species often cause social problems.

To overcome the limitations of the conventional method, many researchers have conducted alternative methodological studies such as deoxyribonucleic acid (DNA) analysis [3-5] and
chemometrics [6-8], and they reported successful results. However, these approaches require a great deal of effort to identify a species and are not suitable for processing large databases with various […].

We consider the computer vision-based image classification technique to be a promising alternative for overcoming the limitations of conventional methods. Various machine learning algorithms have been developed for classifying images based on information acquired from them. These techniques have also been applied to wood identification and include texture features, local or global features and neural network-based models. The gray level co-occurrence matrix (GLCM) was a popular method for extracting texture features from wood images [9-11]. Tou et al [9] performed wood recognition with a combination of GLCM texture and the Gabor filter. Kobayashi et al [10, 11] reported successful results in wood recognition using textures extracted from low-resolution computed tomography data and stereographs. This research suggested the applicability of the non-destructive species classification method for wood cultural properties via computer vision technology. Wood classification by local feature extraction and a codebook-based method has also been performed [12, 13]. However, in many studies it is quite difficult to understand the model's achievements in relation to the morphological structure of the wood.

In this study, the model was designed using local features and a codebook for wood identification. We evaluated the performance of the model using different identification schemes based on the identification accuracy. The feature importance was calculated in order to determine which anatomical features are important for species identification.

The Lauraceae family was used for wood identification. Lauraceae is known to be a difficult family to
identify owing to its very large and complex species composition and morphological similarity. The wood blocks of 11 genera, including 39 species, were received from the RISH Xylarium, Kyoto University. To make microscopy slides (Figure 1), all the blocks were cut to 15 µm thickness by a sliding microtome. The wood sections were then stained with safranin and embedded on the slide. These preparation processes are the same as those of the conventional method by visual inspection. However, we prepared only the cross-sectional slides and not the three orthogonal sections. The cross-section images were acquired with an Olympus™ 2×, 0.08 NA PlanApo objective lens using a BX51 optical microscope equipped with a DP73 charge-coupled device (CCD) camera. The original image was in red, green and blue (RGB) color and had a size of 4800 × 3600 pixels with a pixel resolution of 0.74 µm.

To construct an image dataset, the original images were converted to 8-bit grayscale and cropped to a size of 3600 × 3600 pixels. The images were then resized to pixel resolutions of 1.47, 2.94, 5.88 and 11.76 µm to determine the optimal resolution for Lauraceae identification. Finally, the dataset was constructed with 1658 cross-sectional images.

Figure 1. Cross-sectional images of 39 species in the Lauraceae dataset.

3. Identification model

Figure 2 presents the schemes of the identification model. The model was implemented within the bag-of-features (BOF) framework based on local features and had three sub-models.

3.1. Feature extraction

Local features were extracted from the images using the scale-invariant feature transform (SIFT)
algorithm [14, 15]. SIFT features are robust to changes in scale, rotation and illumination and have demonstrated superior performance in a variety of image classification problems [16-18]. All images in the dataset were converted to SIFT descriptors after the keypoints were extracted by the algorithm. We implemented the SIFT algorithm with the following parameters: two layers in each octave, a Gaussian filter of σ = 1.6 applied to the image of each layer, a contrast threshold of 0.06 and an edge threshold of 10. When the feature extraction was completed, each image was represented by SIFT descriptors, which were 128-dimensional vectors.

The algorithm extracted local features such as blobs, corners and edges from the difference-of-Gaussian images. These SIFT features effectively catch wood fibers, axial and ray parenchyma cells, and vessels, which are the main anatomical features we observe in cross-sectional optical micrographs of wood in species identification [13]. The blobs could be applied to all cells with lumen, and the corners were present on all cell walls between adjacent tissues. In addition, the edges could detect vessels and rays. Thus, the local features that the SIFT algorithm detected from the wood were not very different from the anatomical features observed by wood anatomists.

3.2. Model combining SIFT descriptors with a classifier

The first sub-model is a simple model that combined SIFT descriptors with a classifier (Model 1 in Figure 2). The images with pixel resolutions of 0.74 µm, 1.47 µm, 2.94 µm, 5.88 µm and 11.76 µm were used to determine the optimal pixel resolution for Lauraceae identification. Based on the identification accuracy, the optimal pixel resolution was determined and used in subsequent models. Three different classifiers, k-nearest neighbor (k-NN), linear discriminant analysis (LDA) and support vector machine (SVM), were used for data learning and species identification.
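
The five pixel resolutions listed in Section 3.2 are consistent with repeated 2× downsampling of the 3600-pixel crop. The sketch below only illustrates that arithmetic. The unrounded base value of 0.735 µm (reported as 0.74 µm) and the power-of-two factors are assumptions inferred from the listed numbers, and the function name is hypothetical; the text does not state how the resizing was performed.

```python
# Assumption: the finest resolution is 0.735 um/pixel (printed as 0.74 um)
# and every coarser level halves the image side length.
BASE_SIDE_PX = 3600      # side length of the cropped image, from the text
BASE_RES_UM = 0.735      # hypothetical unrounded pixel resolution


def resolution_ladder(levels=5):
    """Return (side length in pixels, pixel resolution in um) per level."""
    ladder = []
    for i in range(levels):
        factor = 2 ** i
        ladder.append((BASE_SIDE_PX // factor, BASE_RES_UM * factor))
    return ladder


for side, res in resolution_ladder():
    # side * res stays constant: the field of view does not change.
    print(f"{side:>4} px per side -> {res:.3f} um/pixel")
```

Under these assumptions the physical field of view (side length times pixel resolution) is the same at every level, so only the level of anatomical detail available to the SIFT detector changes.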
3.3. Bag-of-features model

The second sub-model (Model 2) is a BOF model [19, 20] that converted SIFT keypoints into codewords via clustering. In this model, the dataset was divided into training and test sets at a ratio of 4 to 1, which were used for learning and identification, respectively. The images in the training set were not represented by the SIFT descriptors, and all the extracted keypoints were used to generate codewords by mini-batch k-means clustering [21], implemented by the k-means algorithm [22] with a processing batch size of 100. Various numbers of clusters (k size), ranging from 100 to 1000, were considered to determine the optimal k size. Once the codebook was created, images could be quantified by the codewords to which all their keypoints belong. In other words, each image was represented by a feature histogram with k-size bins, and each bin represented the number of keypoints contained in the corresponding codeword in the image. This process is called vector quantization and is the core of the BOF model.

For data learning, k-NN, LDA and SVM classifiers were used, as in Model 1. The images in the test set were also represented as feature histograms based on the codebook generated in the training phase. They were then identified by the learned classifiers.

3.4. Random forest classifier and feature importance

The last model is a random forests classifier, which was applied to the BOF model (Model 3). The random forest (RF) was used for feature identification as well as species identification. The RF is an ensemble method for classification [23]. It is a powerful classifier created by combining a number of weak classifiers, the decision trees in this model. This classifier avoids overfitting and has a strong generalization characteristic. This is achieved by random sampling of observations and variables and by combining multiple basic learners during the process of generating the decision trees.
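
The vector quantization step described in Section 3.3 can be illustrated with a minimal sketch. It assumes that a codebook has already been obtained by clustering. The toy two-dimensional descriptors (standing in for 128-dimensional SIFT descriptors), the codebook values and the function names are hypothetical and are not taken from the authors' implementation.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to one descriptor."""
    return min(range(len(codebook)), key=lambda j: dist(descriptor, codebook[j]))


def bof_histogram(descriptors, codebook):
    """Count how many keypoint descriptors of one image fall into each codeword."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_codeword(descriptor, codebook)] += 1
    return histogram


# Toy example: k = 3 codewords and five keypoint descriptors from one image.
codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
image_descriptors = [(1.0, 1.0), (9.0, 0.5), (0.5, 8.0), (0.2, 0.1), (11.0, 1.0)]

print(bof_histogram(image_descriptors, codebook))  # [2, 2, 1]
```

Each image, whatever its number of keypoints, is reduced to a vector of length k, which is what allows classifiers such as k-NN, LDA, SVM and RF to be trained on images of varying content.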
The same training and test sets as in the previous model were applied to the RF classifier. The training process in RF began with bootstrap aggregating (bagging) [24], a sampling technique for creating tree learners. The bagging created trees repeatedly by random sampling with replacement, in the same size as the training set. In this process, because certain images are duplicated in each tree, approximately one third of the images in the training set were not selected for tree learning. These images are called out-of-bag (OOB) samples, and they were used for the validation of each tree. We determined the optimal number of trees for Lauraceae identification by tracking the OOB errors, the mean of the validation errors of all tree learners, from RFs with varying numbers of trees.

RF and bagging differ in the way that they select features (codewords) to separate classes in the tree. Unlike bagging, the RF does not use all features for learning each tree and instead uses a random subset of features. To determine the optimal number of features for a subset, we calculated the OOB error using three different feature numbers: k features (k size), the square root of k features and the binary logarithm of k features. The RF provided us with information on the importance of features, or variables, in classification, which is what we really wanted to obtain from the RF.

In addition, we calculated the inverse document frequency (IDF) score, which represents the rarity of the features or codewords, and compared it with the feature importance of RF. The IDF was calculated by the following equation:

$$\mathrm{IDF}_j = \log \frac{N}{df_j}$$

where df_j is the number of images containing feature j and N is the number of images. Based on the feature importance of RF and the IDF score, we investigated which codewords were used as important criteria for Lauraceae identification. We then visualized the SIFT keypoints belonging to the codeword
in order to grasp their corresponding anatomical features.

Figure 3. Identification accuracies of the three different classifiers with reducing pixel resolutions.
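
As an illustration of the IDF score, the sketch below computes it from BOF feature histograms, assuming the common form IDF_j = log(N / df_j) with the natural logarithm. The toy histograms, the function name and the handling of codewords that occur in no image are assumptions made for this example only.

```python
from math import log


def idf_scores(histograms):
    """Return one IDF score per codeword from a list of image histograms."""
    n_images = len(histograms)
    k = len(histograms[0])
    scores = []
    for j in range(k):
        # df_j: number of images in which codeword j occurs at least once.
        df_j = sum(1 for h in histograms if h[j] > 0)
        # Assumption: a codeword found in no image gets an infinite score.
        scores.append(log(n_images / df_j) if df_j > 0 else float("inf"))
    return scores


# Toy example: four images, k = 3 codewords.
histograms = [
    [2, 2, 1],
    [3, 0, 0],
    [1, 0, 0],
    [4, 1, 0],
]

for j, score in enumerate(idf_scores(histograms)):
    print(f"codeword {j}: IDF = {score:.3f}")
```

A codeword present in every image scores zero, and the score grows as the codeword becomes rarer, which is the sense in which IDF represents the rarity of a feature.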
Figure 4. Identification accuracies of the three different classifiers with various codebook sizes.

Figure 2. Schemes of the identification model.
