A research team led by Prof. Zhu Yongxin from the Shanghai Advanced Research Institute (SARI) of the Chinese Academy of Sciences has recently proposed a new Synchrotron X-ray microdiffraction (μXRD) image screening method based on federated learning (FL), which improves screening accuracy while safeguarding data privacy.
Like traditional XRD technology, Synchrotron μXRD harnesses the wave-particle duality of X-rays to learn more about the structure of crystalline materials. In traditional XRD analysis, the waves scattered by the crystal interfere with one another: where they are in phase they reinforce each other, producing bright (high-intensity) spots or peaks in the diffraction pattern recorded on an area detector, and where they are out of phase they cancel. Traditional XRD typically has a spatial resolution of several hundred micrometers to several millimeters. Synchrotron μXRD, by contrast, employs X-ray optics to focus the excitation beam onto a small point on the sample surface, allowing minute features of the sample to be analyzed. In addition, the high flux, adjustable and well-defined wavelength, and superior collimation of Synchrotron radiation give Synchrotron μXRD greater sensitivity and better resolution of diffraction peaks than traditional laboratory XRD.
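The in-phase condition behind those bright peaks is usually written as Bragg's law, nλ = 2d·sin θ, where d is the spacing between lattice planes, λ the X-ray wavelength, and θ the angle between the beam and the planes. As a rough illustration only, the short Python sketch below computes the detector angle 2θ at which a peak would appear. The function name `bragg_two_theta_deg` and its parameters are hypothetical and are not taken from the researchers' work.

```python
import math


def bragg_two_theta_deg(d_spacing, wavelength, order=1):
    """Return the diffraction angle 2*theta in degrees from Bragg's law.

    d_spacing and wavelength must use the same length unit (e.g. angstroms).
    """
    # Bragg's law: order * wavelength = 2 * d_spacing * sin(theta)
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no diffraction peak for this spacing, wavelength, and order")
    return 2.0 * math.degrees(math.asin(sin_theta))
```

For example, with the Si(111) plane spacing of about 3.1356 Å and a laboratory Cu Kα wavelength of about 1.5406 Å, the function gives a peak near 2θ ≈ 28.4°. Because the wavelength at a synchrotron is adjustable, the same spacing would appear at a different angle there.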
The micro-diffraction technique is often applied to smaller or non-homogeneous samples with different compositions, lattice strains, or crystallite orientations.
Industrial minerals are examined with synchrotron X-ray microdiffraction to assess their crystal quality in terms of crystallinity and potential impurities. The enormous numbers of images that μXRD services produce must be screened before they are processed and stored. However, Synchrotron μXRD facilities cannot cope with a massive inflow of images in a short period of time, and having humans annotate every image would be both difficult and expensive.
At the same time, because service users are reluctant to provide their original experimental images, there are too few usable labeled examples to train a screening model. Industrial users' privacy concerns about μXRD services are therefore a barrier to the development of precise μXRD image screening.
Several organizations each hold data that could, in principle, be compiled into one large and coherent database for training a big data model. But industrial imagery can contain sensitive and private information about users, which generally may not be released outside the establishment where it was created, particularly when "effective de-identification" is not assured. Because of competing interests, institutions may also be unwilling or unable to share their own data with others. Without datasets that are both large and varied, it is difficult to build reliable Synchrotron μXRD image screening, since isolated or scant data can lead to misclassification. For industrial material testing on commercial data, this bias and lack of variety in the images creates the need for a shared technology that does not require data to be centralized. By forming an alliance that follows a protocol agreed on by all sides, such a technology can also prevent the parameters obtained by one institution from being misused to decipher the data and models of another. Federated learning among industrial users is one way to address this problem.
Federated learning takes the machine learning model to the data source instead of bringing the data to the model. This method, often referred to as collaborative learning, enables large-scale model training on data that remains scattered across the devices where it was originally collected. Federated learning unites multiple computing devices into a decentralized system in which the various data collection devices help train the model. This is advantageous because conventional machine learning methods for image classification at device interfaces tend to carry a risk of privacy breaches, whereas federated learning mitigates that risk to some extent by keeping the data on the device and training the model locally.
Each device trains its own copy of the model on the client's local data and then sends the resulting parameters (weights) to a master device, or server, which aggregates the parameters and updates the global model. This training procedure is repeated until the required degree of accuracy is reached. In a nutshell, the concept underlying federated learning is that only model-related updates are ever transferred between devices or parties, never any training data.
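The train-send-aggregate loop described above can be sketched in a few lines of Python. This is only a minimal illustration of federated averaging under simple assumptions, not the researchers' implementation: the model is a plain logistic regression, the server weights each client by its number of samples, and the names `local_update`, `federated_average`, and `run_rounds` are hypothetical.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train a local copy of the model on one client's own data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # logistic-regression output
        grad = X.T @ (preds - y) / len(y)      # gradient of the log loss
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by sample count (assumed)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()


def run_rounds(clients, n_features, rounds=20):
    """Repeat local training and aggregation; only weights leave a client."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in clients]
        global_w = federated_average(updates, [len(y) for _, y in clients])
    return global_w
```

Here `clients` is a list of `(X, y)` pairs that stand in for each institution's private images and labels. The raw arrays are only ever touched inside `local_update`; the server sees nothing but the weight vectors.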
To increase the accuracy of federated learning, the researchers incorporated domain-specific physical information. To account for the uneven data distributions found in the real world, they then introduced a sampling method with new client sampling algorithms. Finally, they developed a hybrid training architecture to cope with the unreliable communication environment between federated learning clients and servers.
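The article does not describe the team's client sampling algorithms, so the sketch below shows only a generic strategy that is sometimes used when clients hold very different amounts of data: in each round, the server picks a subset of clients with probability proportional to their local dataset size. The function name `sample_clients` and its parameters are hypothetical, and the weighting rule is an assumption made purely for illustration.

```python
import numpy as np


def sample_clients(client_sizes, k, rng):
    """Pick k distinct clients, favoring those with more local data (assumed rule)."""
    sizes = np.asarray(client_sizes, dtype=float)
    probs = sizes / sizes.sum()
    # Draw without replacement so no client is selected twice in one round.
    return rng.choice(len(sizes), size=k, replace=False, p=probs)
```

A call such as `sample_clients([10, 200, 50, 5], k=2, rng=np.random.default_rng(0))` returns the indices of two clients to train in that round. Other weightings, for example by label diversity or connection quality, would fit the same interface.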
Extensive experiments showed that the accuracy of the machine learning models improved by 14% to 25%, and that data characteristics can be shared across users or applications without compromising commercially sensitive information.
This Synchrotron X-ray microdiffraction image screening technology, powered by federated learning, will help remove non-technical barriers to data sharing. Its benefits include reducing the cost of training specialists with domain knowledge, saving experts' working time without compromising the efficiency of intelligent classification, preserving the privacy of local clients, and making use of sample information from different clients and organizations. It also encourages the use of unsupervised machine learning, which, unlike supervised machine learning, does not require vast troves of annotated image data. The researchers say that, with their methodology, edge devices on the client side can be equipped with federated learning software packages or even deployed with customized hardware. Once the software (or hardware) is in place, users no longer need to rely on skilled annotators: their images of industrial samples can be labeled intelligently and automatically, without human intervention, as the data flows into the pipeline.
The researchers published their findings on Synchrotron X-ray microdiffraction image screening using federated learning and inference in IEEE Transactions on Industrial Informatics.