In vivo dielectric measuring instrument using picosecond pulse for detection of oral cancer

Background: Interaction of microwaves with water molecules has been found to be a very sensitive probe of molecular surroundings. It has been shown that the dielectric properties of saliva extracted from the mouth contain information about the health of tissues present in the mouth. Based on these principles, the present paper reports instruments used in vivo to derive information related to dielectric properties of saliva in a subject’s mouth. Methods: The instrument consisted of a pulse generator, transmission line and probe. The probe was made from a microstrip line that could be placed easily in the subject’s mouth. The reflected pulses were acquired by placing the probe in the mouth and also again outside the mouth. The difference in reflected pulses in and outside the mouth provided information about impedance of saliva. Pulse features were extracted to determine the status of oral cancer. Results: The experiments have been carried out on 125 subjects with healthy condition and with different stages of oral cancer. The results have been classified into five groups. The features extracted from the data have been classified using Linear Discriminant Analysis (LDA) using the Matlab toolbox. The clustering patterns for different cases were found to be distinct. Conclusion: The technique may be used to suggest the presence of oral cancer. It was also concluded that more cases under different backgrounds need to be studied before it will be possible to adopt the equipment in actual practice.


Introduction
The microwave group at Dr. Babasaheb Ambedkar Marathwada University (BAMU) has conducted extensive studies on liquid structure using the dielectric relaxation approach [1]. Dielectric parameters provide information regarding pairing of dipoles and their rotation in liquids. The experimental method used for the measurement of dielectric parameters along with conductivity was Time Domain Reflectometry (TDR) with sampling time in picosecond range. The experiments were conducted in vitro, i.e., samples were placed in a sampling cell. It was found that useful information in the frequency range of 100 MHz to 10 GHz can readily be obtained.
The above experimental method has also been carried out on many materials of biological significance. The multimers formed due to interaction of amino group and hydroxyl groups have been studied by the method [2].
As the microwave response of a system is very sensitive to the interaction of water with other molecules present in the system, the technique has been used to diagnose clinically the health status of tissues in the human body. Water molecules in healthy tissues behave differently than those in infected tissues. It has been shown that the dielectric parameters of saliva are significantly different for groups of healthy persons compared to persons with squamous cell carcinoma (SCC), and it is possible to identify these groups using permittivity data [3]. Similar results have been obtained on tissues extracted from the patient's mouth [4]. These studies were done in vitro by placing the extracted sample in the transmission line.
The objective of the present work is to report an in vivo study that obtains information related to the dielectric properties of saliva by placing a probe directly in the subject's mouth. The paper describes the equipment and the method of analysis, which differs from that used in the in vitro experiments. The equipment was tested on 125 subjects, and the instrument was found to have the potential to diagnose oral cancer.

Method of data collection
The experiments were performed on human subjects with the permission of, and following the directives of, the ethical committee of BJ Medical College, Pune, India. Before observations were taken, written informed consent was obtained from each subject and the case history was recorded on the prescribed proforma.
The patients in the wards of the Dept. of Surgery, BJMC, Pune, were examined, and clinically diagnosed cases of oral squamous cell carcinoma were placed in the squamous cell carcinoma (SCC) group. The relevant history of each patient with SCC was recorded thoroughly. All these patients were evaluated by routine haemogram. An incisional biopsy was taken from the oral lesion under local anesthesia for confirmation of diagnosis. Only those squamous cell carcinoma patients who had not received any treatment before the study were selected. In vivo tissue readings of patients and controls were taken directly from the oral cavity by placing the probe on the lesion and/or mucosa. Clinical staging of each patient and histopathological grading were also done.
After screening and thorough clinical examination, those who did not have any renal or liver disorders, allergic conditions, autoimmune diseases, any other systemic disease or previous history of any major disease were selected for the control group C. These individuals had no tobacco or other habits and no obvious oral lesion.

Procedure
The procedure to perform the experiment on a given subject (person) was as follows:
1) The subject was asked to have an empty stomach in the morning.
2) The subject was asked to thoroughly brush his/her teeth without paste and gargle/clean his/her mouth.
3) The subject was asked to sit, head slightly down, and was asked not to swallow or move the tongue or lips during the reading process. The probe was placed directly on the tissue in totality and readings were taken as averages of four. In each case, two sets of readings were taken, one of air and one of in vivo tissue. Acetone was used to clean the probe between two readings and between two subjects.
4) The procedure was repeated for all in vivo tissue samples and the waveform data were stored in the computer.
5) The data were analyzed using the procedure given in the following Data Analysis section.

Biopsy procedure
A total of 48 cases of oral squamous cell carcinoma (OSCC) were screened and all patients consented to biopsy. Written consent was obtained. Routine hematological examination and medical check-ups were done on all the selected cases, and all were found fit to undergo surgical intervention. Clinical photographs of the lesions were taken prior to the biopsy procedure. Biopsies were then taken from the representative sites using a 5 mm punch, after achieving anesthesia with 2% lidocaine with 1:80000 adrenaline. The local anesthetic solution was injected well away from the biopsy site. Specimens of sufficient depth were taken so as to include intact covering epithelium, subepithelial connective tissue, submucosa and muscle. The surgical sites were then sutured with ethicon suture and hemostasis was achieved. The tissue specimens were labeled and immediately fixed in 10% formalin for 24 h. The specimens were processed as per the procedure laid down by Bancroft and Stevens [5].
The processed tissues were embedded in paraffin wax using Leuckhart's 'L' blocks [5] as molds. The wax blocks were labeled accordingly. Six sections of 5 µm thickness were cut from each wax block and stained with haematoxylin and eosin (H & E) as per the procedure described by Bancroft and Stevens [5].
The detailed observations of all slides were made under a light microscope to see changes in epithelium, basement membrane, connective tissue and submucosa in order to histopathologically grade the squamous cell carcinoma according to Broder's grading system [6].
Accordingly, three grades of oral squamous cell carcinoma were found: Grade I (32 patients), Grade II (13 patients), and Grade III (3 patients).
Histopathological grading was done according to the Broder's numerical grading system [7], which depends upon the differentiation of tumor cells. A Grade I lesion is highly differentiated while grade IV is poorly differentiated. Using standard procedure, patients with their number were grouped as follows: 1. (

Description of the equipment
In TDR, a step-like pulse produced by a pulse generator propagates through the coaxial line and is detected from the sample section placed at the end of the line [8]. The reflected pulse also propagates through the same line. The difference between the reflected and incident pulses recorded in the time domain contains the signature of the sample. Details of the set-up in our experiments are as follows:

The pulse generator
The pulse generator is the main unit of the setup. The unit generates a pulse with a rise time in the picosecond range and a peak voltage of at least 1 V on a 50 ohm impedance transmission line. The PCI-3125 system [9] was used as a handheld pulse generator. The unit was very compact, with a weight of 130 g and a size of 153 x 76 x 3 mm. Before the pulse generator is used, all of its operating parameters need to be optimized. Each parameter was optimized individually by comparing the noise obtained with different values of that parameter. For this, experiments were performed on a healthy subject. In each experiment, noise was estimated by the standard deviation in the baseline of the reflected pulse. These parameters and their optimized values are summarized as follows: 1) The peak value of the pulse can be varied from 0 to 5 V. In our experiment, the value was set to 1 V.
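The noise criterion described above can be sketched in a few lines of Python. This is only an illustration, not the software supplied with the pulse generator: the function names, the number of baseline points (n_baseline) and the dictionary of candidate settings are assumptions introduced here.

```python
import statistics


def baseline_noise(reflected_pulse, n_baseline=10):
    """Noise estimate: standard deviation of the first n_baseline samples.

    n_baseline is a hypothetical choice; the paper does not state how many
    baseline points were used for the noise estimate.
    """
    baseline = reflected_pulse[:n_baseline]
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    return statistics.stdev(baseline)


def pick_quietest(setting_to_pulse, n_baseline=10):
    """Return the parameter setting whose pulse has the lowest baseline noise."""
    return min(
        setting_to_pulse,
        key=lambda setting: baseline_noise(setting_to_pulse[setting], n_baseline),
    )


if __name__ == "__main__":
    # Made-up pulses recorded with two hypothetical settings of one parameter.
    pulses = {
        "setting_a": [0.0, 0.4, -0.4, 0.0, 5.0],
        "setting_b": [0.0, 0.1, -0.1, 0.0, 5.0],
    }
    print(pick_quietest(pulses, n_baseline=4))
```

Each parameter would be scanned in this way while the others are held fixed, which matches the one-at-a-time optimization described in the text.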

Microstrip mouth sensor
The mouth sensor was designed on a microstrip line. The sensor could be removed from the connector for cleaning purposes. On one side of the microstrip line, a conducting copper line about 1 mm wide was deposited. The width of the copper strip was chosen such that the characteristic impedance was 50 ohms. The length of the strip line was about 10 cm, with the far end rounded so that it could be placed easily in the mouth. Eight cm of the probe was covered with teflon tape, and 2 cm was kept open. During a measurement, care was taken to keep this open area in the mouth in such a way that it touched the area of interest in the subject's mouth. The other side of the microstrip line was used as a ground. An IBM laptop computer was used as the dedicated computer for the TDR system. The pulse generator was plugged into the computer along with the dedicated software. The complete setup with all components is shown in Figure 1. A typical operating window is shown in Figure 2.

The data analysis
It was found that there are certain limitations in performing the experiments in vivo. The probe can only be kept in the subject's mouth for a limited time, and it cannot be held steady because the subject cannot keep the mouth still for long. In vitro, the number of averages (n) should be as large as possible for the best signal to noise ratio, but in vivo n must be practical. Two warm-up pulses along with four averages required about four minutes. This time was found to be practical for the experiments.
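The trade-off between averaging and acquisition time can be illustrated with a short Python sketch. It assumes uncorrelated noise, for which averaging n pulses reduces the noise by a factor of the square root of n, and it assumes that every pulse (warm-up or averaged) takes the same time, so that four minutes for six pulses gives the default minutes_per_pulse. Both assumptions and all names are introduced here for illustration only.

```python
import math


def noise_after_averaging(single_shot_noise, n_averages):
    """Noise of the mean of n_averages pulses, assuming uncorrelated noise."""
    if n_averages < 1:
        raise ValueError("n_averages must be at least 1")
    return single_shot_noise / math.sqrt(n_averages)


def acquisition_minutes(n_averages, n_warmup=2, minutes_per_pulse=4.0 / 6.0):
    """Total time, assuming equal time per pulse (4 min for 2 + 4 pulses)."""
    return (n_warmup + n_averages) * minutes_per_pulse


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, noise_after_averaging(1.0, n), acquisition_minutes(n))
```

Under these assumptions, going from 4 to 16 averages would halve the noise again but would roughly triple the time the probe has to stay in the mouth.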
Due to this limitation, the noise in the recorded signals becomes significant. When the time domain data are transformed to the frequency domain, one gets noisy frequency domain spectra, from which it was not possible to determine reliable values of dielectric parameters. It was therefore decided not to transform the time domain data to the frequency domain to determine dielectric parameters. Instead, features were extracted directly from the time domain signals using the following procedure:
1. Two tdr files were created. The first file contained data for the experiment with the probe in air, i.e., without putting the probe in the mouth. The second file contained data with the probe in the patient's mouth. The probe was placed in such a way that the open part of the transmission line touched the suspected cancerous place in the mouth. Examples of these files are shown in Figures 3a and 3b. Incident and reflected pulses can be seen clearly in these figures. The reflected pulse is distorted when the probe is placed in contact with the subject's mouth. These files are named *.tdrR and *.tdrX, where R corresponds to the probe in air and X corresponds to the probe in the mouth.
2. As the reflected pulse carried the relevant information from the sample, and the incident and reflected pulses were well separated, the reflected pulse was extracted from the files for further analysis. Figures 4a and 4b show the extracted reflected pulses. These files are named *.refR and *.refX.
3. It was observed that the reflected pulse becomes approximately constant after some time, so the data beyond that point can safely be removed. It was decided after analysis to take point 650 as the starting point of the data and point 6650 as the end point. These files are named *.ref1R and *.ref1X.
4. The initial baselines in both *.ref1R and *.ref1X were shifted to zero. The baseline value was determined by taking the average of the first ten points. The resulting files are called *.reffR and *.reffX. An example of these files is shown in Figure 5.
5. The difference between the *.reffR and *.reffX files is that the *.reffX files contain the information regarding the sample in contact with the probe. In our case, the sample is saliva in the mouth. It is expected that the saliva in noncancerous and cancerous cells will have different properties. The values of *.reffR and *.reffX were determined at every 200th point, each value being averaged over 10 neighboring points. The resulting vectors are called $V_R$ and $V_X$. A feature vector p is computed using the following expression:
$$p = \frac{V_R - V_X}{V_R + V_X} \qquad (1)$$
The components of the feature vector p represent the reflection coefficient at different representative points of the pulse. It will be zero at the beginning and become constant at the end of the pulse. The end values will depend on the conductivity of the sample. A larger value means more conductivity in the sample. These components of p are used as the features of the sample and can be used for diagnosis.
6. The feature vector p is extracted for each set of measurements using the five steps mentioned above. These feature vectors are used as inputs to Linear Discriminant Analysis (LDA) [10]. In this technique, features are classified using statistical parameters such as means and covariances. The dimensionality of the feature vectors is reduced and they are projected into a two-dimensional classifier space. The feature vectors may be classified by their clustering in this space. The features corresponding to two different groups will form two different clusters. The quality of the feature vectors is judged from the clusters: if the clusters are well separated, the feature vectors are good enough to be used for classification of the groups. A typical example of p-features extracted for different groups is given in Figure 6. Codes for LDA were written using Matlab.
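Steps 3 to 5 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the end point 6650 is treated as exclusive, the 10 neighboring points are taken as the 10 samples starting at each 200th point, and components where VR + VX is numerically zero (the shifted baseline) are set to zero.

```python
def mean(values):
    return sum(values) / len(values)


def preprocess(reflected, start=650, end=6650, n_baseline=10):
    """Step 3 and 4: crop the reflected pulse and shift its baseline to zero."""
    window = reflected[start:end]  # assumption: end point is exclusive
    offset = mean(window[:n_baseline])  # average of the first ten points
    return [v - offset for v in window]


def sample_points(pulse, step=200, n_neighbors=10):
    """Step 5: average n_neighbors samples starting at every step-th point."""
    return [mean(pulse[i:i + n_neighbors]) for i in range(0, len(pulse), step)]


def feature_vector(ref_air, ref_mouth, eps=1e-12):
    """Compute p = (VR - VX) / (VR + VX) component by component."""
    vr = sample_points(preprocess(ref_air))
    vx = sample_points(preprocess(ref_mouth))
    p = []
    for r, x in zip(vr, vx):
        denom = r + x
        # Assumption: on the zeroed baseline the ratio is undefined, use 0.
        p.append(0.0 if abs(denom) < eps else (r - x) / denom)
    return p


if __name__ == "__main__":
    # Synthetic step-like pulses, not measured data.
    air = [0.0] * 1000 + [2.0] * 6000
    mouth = [0.0] * 1000 + [1.0] * 6000
    p = feature_vector(air, mouth)
    print(len(p), p[0], p[-1])
```

With a 6000-point window and a spacing of 200 points, this sketch yields 30 components per measurement; the number of components used in the study itself is not stated in the text.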
As discussed earlier, the measurements have been classified in three categories and five groups as follows:
Group g1: Subjects with no tobacco using habits
Group g2: Subjects with tobacco using habits
Groups g3-g5: Subjects with known cases of cancer. As mentioned in the previous section, these are further classified by grade as grade I (g3), grade II (g4) and grade III (g5).
LDA was used to classify the measurements from the clinically classified groups listed above.
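To illustrate how LDA separates two groups of feature vectors, a minimal two-class Fisher discriminant is sketched below in Python. It is a simplified stand-in for the Matlab code used in the study, which is not reproduced in the paper: it projects onto a single discriminant direction rather than a two-dimensional space, the small ridge term added to the within-class scatter is an assumption made here to keep the matrix invertible when cases are few, and all names and data are hypothetical.

```python
def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter_matrix(rows, mu):
    """Sum of outer products of deviations from the class mean."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [a - b for a, b in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular within-class scatter matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fisher_direction(class_a, class_b, ridge=1e-6):
    """Return (w, threshold) with w proportional to Sw^-1 (mu_a - mu_b)."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, mu_a), scatter_matrix(class_b, mu_b)
    d = len(mu_a)
    sw = [[sa[i][j] + sb[i][j] + (ridge if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    w = solve(sw, [a - b for a, b in zip(mu_a, mu_b)])
    # Threshold: midpoint between the two projected class means.
    threshold = 0.5 * sum(wi * (a + b) for wi, a, b in zip(w, mu_a, mu_b))
    return w, threshold


def classify(p, w, threshold):
    score = sum(wi * pi for wi, pi in zip(w, p))
    return "a" if score > threshold else "b"


if __name__ == "__main__":
    # Made-up two-component feature vectors for two groups.
    group_a = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1]]
    group_b = [[3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
    w, t = fisher_direction(group_a, group_b)
    print([classify(p, w, t) for p in group_a + group_b])
```

A new measurement would be assigned to the group on whose side of the threshold its projected score falls; with more than two groups, additional discriminant directions are needed, as in the two-dimensional plots discussed in the paper.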

Results and discussions
The clusters corresponding to group 1 with other groups are shown in Figures 7a-7d. In these figures, the center of each cluster is represented by a black circle and rectangular boundaries represent the most likely region for the cluster around the center.

[Figure axes: reflected pulse (mV) versus sample data points]
The observations are as follows:
1. From Figure 7a, it can be seen that points under group g1 are well separated from the points under group g2. Only one point from group g1 overlaps with group g2, and three points from group g2 overlap with group g1. This means 1.3% of clear cases may be confused with group g2, whereas 4% under group g2 may be confused with group g1. This adds up to an overall 5.3% confusion between groups g1 and g2.
2. The clusters corresponding to groups g1 and g3 are shown in Figure 7b. In this case, only one point from each group overlaps, which leads to a 2.7% error in recognition of cases under the groups.
3. The clusters under groups g1 and g4 are well separated, as seen in Figure 7c, which leads to 100% recognition. One should notice that 13 cases were available under group g4. One can see a similar trend for groups g1 and g5, which also show 100% recognition. Here, only three cases were available in group g5. However, it should be noted that the centers of the clusters are well separated in each case. This shows that the features of p are useful in recognition of the group for any given measurement.
4. Clusters corresponding to all five groups are shown in Figure 8. It can be seen that the centers of all groups are well separated. Group g5 is clearly well separated from all other groups. Group g1, belonging to clear cases, is well separated from groups g3, g4 and g5, belonging to cancerous cases. All cancerous cases have a tendency to fall on the left side, whereas clear cases and tobacco-related cases fall on the right side.
From the above discussion, we may summarize as follows:
a. If a feature vector points to a region on the right side of the space, it will be either a clear case, g1, or a case under group g2. The lower part belongs to group g1, and the upper part to g2.
b. If a feature vector points to the left side of the region, it belongs to a cancerous case. It is interesting to note that the lower and upper sides belong to cases with a more cancerous condition, i.e. grade 3 and grade 2, respectively. The middle left belongs to group g3, i.e. grade 1.
c. From the clustering behavior, it is possible to distinguish almost 100% between clear cases and cancerous cases by seeing the position of the corresponding point in the cluster.
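The overlap percentages quoted above can be checked with simple arithmetic. The Python sketch below assumes that each percentage is the number of overlapping points divided by the total number of points in the two groups being compared, and that groups g1 and g2 together contain the 125 - 48 = 77 subjects without carcinoma. The individual sizes of g1 and g2 are not stated in the text, so this denominator is an assumption, as is the function name.

```python
def overlap_percent(n_overlapping, n_total):
    """Percentage of points falling in the other group's region."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_overlapping / n_total


if __name__ == "__main__":
    # Assumption: g1 and g2 together hold the subjects without carcinoma.
    non_cancer = 125 - 48
    g1_in_g2 = overlap_percent(1, non_cancer)  # one g1 point inside g2
    g2_in_g1 = overlap_percent(3, non_cancer)  # three g2 points inside g1
    print(round(g1_in_g2, 1), round(g2_in_g1, 1), round(g1_in_g2 + g2_in_g1, 1))
```

Under this assumption the values come out close to the reported 1.3%, 4% and 5.3%; the small differences would be explained by rounding or by a slightly different denominator.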

Conclusion
It was found that handheld time domain reflectometry can easily be used in vivo. The frequency domain spectra were noisier than the data in the time domain. The noise can be reduced by increasing the number of pulses averaged, which will, in turn, increase acquisition time. The subject will not remain comfortable if the signal integration time is too long. The optimum time was found to be about four minutes, corresponding to two warm-up pulses and four averages. Another difficulty in the presently used TDR versus conventional TDR was finding an appropriate thickness of the sample. In conventional TDR, SubMiniature Version A (SMA) lines are used to determine the effective length of the sample. Here we have used a microstrip line, an unconventional transmission line. The surface of the strip line remains in contact with the sample. It is very difficult to estimate the effective sample length, and there is a need to compensate for surface effects as well.
Due to the difficulties mentioned above, it was decided not to adopt the conventional method of material characterization based on values of dielectric parameters and conductivity. A new set of feature vectors was extracted from reflected pulses acquired in the time domain. The data suggested that it was possible to diagnose cancerous cells with the feature vectors by using the LDA technique. However, the method needs to be cross-validated and detection limits determined using more cases to enhance confidence in the results.