Abstract

Soybean seeds are a key ingredient for producing quality tofu. Conventional methods for assessing soybean seed quality for tofu are time-consuming and labor-intensive. This study employs hyperspectral imaging (HSI) and machine learning to rapidly predict gypsum tofu quality from soybean seeds. Two hundred soybean seed varieties were classified into four categories based on tofu quality using hierarchical clustering. Hyperspectral scans of the soybean seeds were captured in the 900-1700 nm range. Using the Extreme Gradient Boosting (XGBoost) algorithm, ten critical wavelengths were identified that correlate with protein, carbohydrate, and oil contents. A Convolutional Neural Network (CNN) model was then trained on HSI data from the four soybean categories. The CNN model classified new soybean seeds into the quality classes with 96-99% accuracy. Further validation through tofu production demonstrated the model’s robustness in predicting key tofu quality parameters such as yield, firmness, and springiness. Overall, this research enabled rapid, non-destructive prediction of tofu quality from soybean seeds using HSI and a CNN. With further refinement, this approach could substantially speed up soybean seed quality assessment.
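As a rough illustration of the clustering step described above, the following Python sketch groups synthetic samples into four classes by agglomerative (hierarchical) clustering. The abstract does not state the linkage method, distance metric, or preprocessing, so average linkage on Euclidean distances is an assumption here. The sample values, group centers, and function names are hypothetical and are not taken from the study.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns one integer label per input point.
    """
    # Start with every sample in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # b > a, so deleting index b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical data: 40 samples described by three tofu quality traits
# (yield, firmness, springiness), drawn around four made-up group centers.
rng = random.Random(0)
centers = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
points = [
    tuple(c + rng.gauss(0.0, 0.5) for c in center)
    for center in centers
    for _ in range(10)
]

labels = agglomerative_clusters(points, n_clusters=4)
print(sorted(labels.count(k) for k in set(labels)))
```

With real measurements, the traits would usually be standardized first so that no single trait dominates the distance, and a library routine (for example from SciPy) would normally replace the hand-written merge loop, which is only practical for small sample counts.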


Figure 1: Classification of soybean seeds based on gypsum tofu quality using (A) hierarchical clustering analysis (HCA) and (B) principal component analysis (PCA), with the loading scores of each component. Classes I, II, III, and IV are shown in red, yellow, green, and blue, respectively.



Citation

Malik, A., Ram, B., Arumugam, D., Jin, Z., Sun, X., & Xu, M. (2024). Predicting gypsum tofu quality from soybean seeds using hyperspectral imaging and machine learning. Food Control, 160, 110357.