2021 Fiscal Year Annual Research Report
Development of AI models for seed and insect classification
Publicly Offered Research
Project Area | Excavating earthenware: Technology development-type research for construction of 22nd century archeological study and social implementation |
Project/Area Number | 21H05355 |
Research Institution | Kumamoto University |
Principal Investigator |
|
Project Period (FY) | 2021-09-10 – 2023-03-31 |
Keywords | machine learning / deep learning / small-dataset training / archaeology / impression method |
Outline of Annual Research Achievements |
The main focus of our project is automating the identification of indentations in late Jomon pottery using deep learning. The main challenge is that we do not have enough samples from the Jomon period to train a model. Because reliable species-identified data from Jomon pottery indentations is lacking, we created a dataset of soft X-ray images of modern-day seeds and insects on clay tablets. This dataset contains seven different types of objects, which we use to train deep learning models. We believe that a model trained on modern-day seeds will be able to accurately classify the indentations from the Jomon period. Our work has two parts: first, training accurate models using different deep learning (DL) techniques; second, developing pre-processing techniques to clean the images.
|
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
This year, we characterized the dataset and studied the steps necessary to train accurate deep-learning models. First, we determined the minimum amount of data required to train our models. With this information, we tested different deep learning architectures and assessed their accuracy. Once the best architecture was selected, we focused on improving its accuracy. We conducted an ablation study of techniques for improving accuracy: "hyper-parameter optimization", "model ensemble", "image augmentation" and "test time augmentation (TTA)". Each technique was tested individually and in combination to evaluate its impact on performance. The end result was a model that classified the data with more than 90% accuracy.
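The structure of such an ablation study can be sketched as follows. This is a minimal illustration and not the project's actual code: the technique names come from the report, while the function names `ablation_configs` and `run_ablation` and the user-supplied `evaluate` callback are hypothetical. It enumerates every on/off combination of the four techniques (16 configurations, including the baseline with none enabled) and ranks them by the accuracy that `evaluate` returns.

```python
from itertools import combinations

# Technique names are taken from the report; everything else is illustrative.
TECHNIQUES = (
    "hyper-parameter optimization",
    "model ensemble",
    "image augmentation",
    "test time augmentation (TTA)",
)


def ablation_configs(techniques=TECHNIQUES):
    """Return every on/off combination of the techniques, baseline first."""
    configs = []
    for k in range(len(techniques) + 1):
        # All ways of enabling exactly k techniques
        for enabled in combinations(techniques, k):
            configs.append({name: name in enabled for name in techniques})
    return configs


def run_ablation(evaluate, techniques=TECHNIQUES):
    """Score each configuration with a hypothetical `evaluate(config) -> accuracy`.

    `evaluate` is assumed to train and test a model with the given
    techniques switched on or off and to return its accuracy.
    """
    results = [(config, evaluate(config)) for config in ablation_configs(techniques)]
    # Sort from highest to lowest accuracy
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Comparing the single-technique configurations against the baseline isolates each technique's individual contribution, while the multi-technique configurations show whether the gains add up when combined.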
|
Strategy for Future Research Activity |
In the following year, we will focus on pre-processing the data to achieve higher accuracy on the Jomon dataset. Although our initial results show high accuracy on the research dataset, this accuracy is not reflected on the archeological dataset. We believe this is mostly due to differences between the data in the two datasets. Image characteristics such as illumination, size, and noise may be reducing accuracy on the archeological dataset. This year we will study ways of processing these images so that they resemble the images used to train our model. We will investigate heuristic approaches, as well as machine-learning-based techniques, for transferring image characteristics. We will also increase the number of classes in our model.
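One simple heuristic of this kind is sketched below, purely as an illustration and not as the method the project adopted. It assumes grayscale images held as NumPy arrays and rescales an archeological image so that its mean and standard deviation of pixel intensity match those of the training images, which addresses illumination differences only. The function names `reference_stats` and `match_intensity_stats` are hypothetical.

```python
import numpy as np


def reference_stats(training_images):
    """Return the mean and standard deviation of pixel intensity over all training images."""
    pixels = np.concatenate(
        [np.asarray(img, dtype=np.float64).ravel() for img in training_images]
    )
    return pixels.mean(), pixels.std()


def match_intensity_stats(image, target_mean, target_std, eps=1e-8):
    """Shift and scale `image` so its intensity mean and std match the targets.

    `eps` avoids division by zero for a constant image, which is then
    mapped to `target_mean` everywhere.
    """
    image = np.asarray(image, dtype=np.float64)
    standardized = (image - image.mean()) / (image.std() + eps)
    return standardized * target_std + target_mean
```

The output is not clipped, so values may fall outside the original intensity range. Differences in size and noise would need separate steps, such as resizing and denoising.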
|