
ORIGINAL ARTICLE
Ahead of print publication
Automated follicular assessment using a novel two-dimensional ultrasound-based solution

1 Philips Research India, Philips Innovation Campus, Manyata Tech-Park, Bengaluru, Karnataka, India
2 Philips Research India, Philips Innovation Campus, Manyata Tech-Park, Bengaluru, Karnataka; KLA Corporation, Chennai, Tamil Nadu, India
3 Gunasheela Surgical and Maternity Hospital, Bengaluru, Karnataka, India


Date of Submission: 20-Oct-2020
Date of Decision: 09-Dec-2020
Date of Acceptance: 15-Dec-2020
Date of Web Publication: 13-May-2021


Background: High intra- and interobserver variability in follicular assessment using two-dimensional (2D) ultrasound (US) remains a concern. To address this issue, we developed a novel software solution that automatically provides the follicle count and follicle diameters from 2D US images obtained by a manual sweep of an ovary. The primary objective of this study was to compare the results of the automated solution with a manual 2D US-based assessment.

Methods: In the first phase, multiple follicular US sweeps were collected from 54 subjects; these sweeps were used to develop the software. In the second phase, data from 10 subjects were collected to validate the developed solution. During each phase, the count and diameters of follicles ≥5 mm were recorded by the sonologist using 2D US.

Results: For the total follicle count, a high correlation (0.787) was observed between the solution and manual assessment. The 95% limits of agreement between the two methods ranged from −4.258 to 4.232. The two methods had an excellent correlation (0.817) for the measurement of mean follicular diameter. However, the solution tended to underestimate the mean diameter by an average of 1.725 mm (±2.16 mm). The limits of agreement between the two methods for mean diameter measurement ranged from −5.960 to 2.508 mm.

Conclusion: This study validates the feasibility of our solution for automatic assessment of follicle count and diameter with accuracy comparable to the 2D US-based manual assessment. We further observed that the solution's performance is better than the known intra- and interobserver variability of the manual assessment. We recommend further validation of the solution to confirm these initial results and the potential time gain with an automated assessment.

Keywords: Automation, computer-assisted, image processing, ovarian follicle, assisted reproductive techniques, ultrasonography

How to cite this URL:
Firtion C, Ramachandran G, Nellur Prakash SP, Hiwale S, Vajinepalli P, Manyam I, Gunasheela D. Automated follicular assessment using a novel two-dimensional ultrasound-based solution. J Med Ultrasound [Epub ahead of print] [cited 2021 Oct 26]. Available from: http://www.jmuonline.org/preprintarticle.asp?id=315941

Introduction

Infertility is a worldwide problem, estimated to affect around 15% of couples at some point in their lives.[1] The age-standardized prevalence rate of female infertility increased significantly from 1366.85 per 100,000 in 1990 to 1571.35 per 100,000 in 2017, an increase of 0.37% per year.[1] Over a similar period, the absolute number of couples affected by infertility grew from 42 million in 1990 to 118 million in 2017.[1],[2] Infertility treatment generally involves ovarian stimulation, in which multiple follicles are recruited simultaneously under the influence of drugs; however, these follicles grow at different rates. Therefore, assisted reproductive technology (ART)-based methods require regular and careful monitoring of follicles. The total number of ovarian follicles (antral follicle count) and their dimensions are two important parameters that are closely monitored during ovarian stimulation procedures.

Ultrasound (US) imaging is the preferred method for monitoring ovarian stimulation, and the transvaginal route is the most commonly used. Serial US scans are performed during the course of ovulation induction to track the growth of ovarian follicles. Conventionally, this is done using two-dimensional (2D) US, where a clinician manually counts follicles and measures their dimensions. However, there is a lack of consensus on standard protocols for measurement of follicular diameter.[3],[4],[5],[6] This, along with subjectivity in assessment,[7] is responsible for the high intra- and interobserver variability observed in 2D US-based follicular assessment.[6] This has led to the development of three-dimensional (3D) US-based software solutions, which provide an automated assessment of different follicular parameters.

The 3D US-based software solutions have been shown to significantly reduce intra- and interobserver variability in follicular assessment, along with a significant reduction in the total time required for assessment.[8] Although useful, US devices with a 3D transvaginal probe and automated software are either not available or costly,[9] which makes such solutions unfeasible for resource-constrained countries; unfortunately, these are the countries that have the majority of infertile couples.[2] Moreover, no statistically significant difference has been observed in the success rate of assisted reproduction treatment when a 3D method was used instead of a 2D method.[8] Hence, manual 2D US-based assessment of ovarian follicles remains the method of choice worldwide. This makes it important to have intuitive solutions that help clinicians perform a better follicular assessment using the conventional 2D method and hardware. Unfortunately, unlike 3D solutions, no systematic attempts have been made to develop 2D US-based solutions for automatic follicular assessment. Considering the need for such a solution, we have developed a novel 2D US-based solution that automatically provides the number of follicles and their sizes from the images obtained by a manual 2D US sweep of the ovary. The primary objective of this study is to compare our automated solution with a manual 2D US-based assessment for measurement of follicle count and diameter on follicles larger than 5 mm in diameter.

Materials and Methods

This prospective observational study was split into two phases. In the first phase, 60 subjects were recruited from two centers, a general hospital and an infertility institute, from June 2014 to October 2015. The inclusion criteria were women aged 18 years or above who had been advised to undergo an infertility-related pelvic ultrasound scan (infertility screening or assisted reproduction treatment). All subjects were treated according to the established protocols of the respective institutes. For a given participant, after the ultrasound-based assessments for follicular monitoring, one to five 2D US sweep recordings of both ovaries were obtained and stored in a digital format. For the study, US scans from the 6th day (poststimulation) onward were used. The multiple follicular assessment US sweeps obtained from these subjects were used to develop and test the software (training and testing datasets) for automatic assessment of the number of follicles and their sizes. In the second phase, 10 subjects were recruited from the same institutes. The 2D US sweep data from these subjects were used for a blind validation of the developed automated solution. The study was approved by the Institutional Review Board (IRB approval number: JSS/MC/IEC/831/2014-15), and informed consent was obtained from all subjects.

Manual two-dimensional ultrasound-based follicular assessment

For all participants, the total number of ovarian follicles in each ovary and their sizes were recorded by the clinicians/sonologists using the conventional 2D US-based method. For this assessment, only follicles larger than 5 mm in diameter were considered. For each follicle, the plane in which it appeared largest and roundest was located. The two largest diameters were then measured using manual calipers, and the mean of these two diameters was computed and recorded. A Philips ClearVue 550 system with a C9-4v probe was used for this assessment. The manual 2D US-based assessment was considered the ground truth for algorithm development and subsequent comparison.

Data acquisition for algorithm by two-dimensional ultrasound sweep

A sweep of each ovary was performed in one of two systematic ways: (1) from the lateral end to the medial end of the ovary or in the opposite direction (LM sweep), or (2) from the anterior side to the posterior side of the ovary or in the opposite direction (AP or PA sweep). All sweeps included a safety margin, i.e., a few images extending beyond the ovary at the beginning and at the end of the sweep, to ensure that the set of images contained the whole ovary. For each ovary, four to five sweeps were collected at each US scan, including two to three LM sweeps and two to three AP sweeps. The series of images were recorded using the cineloop mode of a Philips ClearVue (650 and 850) system with a C9-4v transvaginal probe (4–9 MHz). The cineloop time for each sweep was fixed at 10 s.

Assessment of the stored two-dimensional ultrasound sweeps by independent experts

Two independent experts, who were blinded to the results of the manual 2D US-based assessment and to the clinical history, were asked to review all the recorded US sweeps and assess the number and diameters of follicles. An annotation tool was used for this purpose, which allowed the experts to review each US sweep frame by frame.

Algorithm for automated follicular assessment

The proposed algorithm is based on a region-growing approach for image segmentation. The algorithm segments ovarian follicles based on their unique geometrical and statistical properties. The acquired 2D US sweep images are first subjected to preprocessing, which comprises two steps: contrast enhancement and de-noising. The contrast enhancement is used to increase the contrast between follicular and nonfollicular regions. This is followed by image normalization to enhance the contrast between the follicular regions and to highlight boundaries between them. However, the contrast enhancement also amplifies noise, and hence, the intensity normalization step is followed by a de-noising procedure to mitigate the effects of noise.
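The preprocessing order described above can be illustrated with a minimal sketch. The paper does not name the specific filters, so the choices here are assumptions: a linear min-max stretch stands in for the contrast enhancement and normalization steps, and a 3x3 median filter stands in for the de-noising step. The function names are hypothetical.

```python
from statistics import median


def stretch_contrast(image, low=0.0, high=1.0):
    """Linearly rescale pixel intensities to [low, high] (assumed enhancement step)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: nothing to stretch
        return [[low for _ in row] for row in image]
    scale = (high - low) / (hi - lo)
    return [[low + (p - lo) * scale for p in row] for row in image]


def median_denoise(image):
    """3x3 median filter (assumed de-noising step); borders use the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(median(window))
        out.append(row)
    return out


def preprocess(image):
    """Enhance contrast first, then suppress the noise that enhancement amplifies."""
    return median_denoise(stretch_contrast(image))
```

The ordering mirrors the text: de-noising comes last because stretching the intensity range also stretches the noise. A median filter is used here only because it removes isolated speckle-like outliers while keeping edges fairly sharp; the filter used in the actual solution may differ.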

The preprocessed images are then subjected to the iterative region-growing method. Region-growing routines are a class of seed-based image segmentation algorithms in which pixels in a neighborhood are successively added to the current segment until a specific image intensity convergence criterion is met. The iterations involve computation of the shape of the follicle by imposing a constraint on the shape of the regions and the stability of the shape over iterations. A shape constraint is applied to the segmented regions to prevent oversegmentation of the follicles. The plot of the shape parameter over the iterations is analyzed to identify the optimal point at which to stop region growing. For each follicle segmented using this approach, the major and minor axis lengths are computed using the best-fitted ellipse method. The average of these two axis lengths is then taken as the mean diameter of the follicle. Based on all the follicles identified by this approach, the total follicle count is computed, and the mean diameter of each identified follicle is also provided. [Figure 1] shows the different stages of image processing used by the algorithm for automated follicular assessment.
Figure 1: Image processing steps in the automated follicular assessment; (a) original image; (b) contrast-enhanced image; (c) de-noised image; (d) outline of a segmented follicle; (e and f) measurement of the different follicular axes
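The segmentation and measurement steps can be sketched as follows. This is a simplified illustration, not the solution's implementation: it uses a plain intensity-tolerance region grower rather than the shape-constrained iterative variant described above, and it assumes that "best-fitted ellipse" means the ellipse with the same second central moments as the region. The function names, the `tolerance` parameter, and `pixel_size_mm` are hypothetical.

```python
from collections import deque
from math import sqrt


def grow_region(image, seed, tolerance):
    """Collect 4-connected pixels whose intensity is within `tolerance` of the seed."""
    h, w = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= rr < h and 0 <= cc < w and (rr, cc) not in region
                    and abs(image[rr][cc] - seed_value) <= tolerance):
                region.add((rr, cc))
                queue.append((rr, cc))
    return region


def ellipse_axes(region, pixel_size_mm=1.0):
    """Major and minor axis lengths of the ellipse with the region's second moments."""
    n = len(region)
    mean_r = sum(r for r, _ in region) / n
    mean_c = sum(c for _, c in region) / n
    # Central second moments; 1/12 accounts for the extent of each square pixel.
    s_rr = sum((r - mean_r) ** 2 for r, _ in region) / n + 1 / 12
    s_cc = sum((c - mean_c) ** 2 for _, c in region) / n + 1 / 12
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in region) / n
    # Eigenvalues of the 2x2 covariance matrix are half_trace +/- spread.
    half_trace = (s_rr + s_cc) / 2
    spread = sqrt(((s_rr - s_cc) / 2) ** 2 + s_rc ** 2)
    # A filled ellipse with semi-axis a has variance a^2 / 4, so full axis = 4 * sqrt(var).
    major = 4 * sqrt(half_trace + spread) * pixel_size_mm
    minor = 4 * sqrt(max(half_trace - spread, 0.0)) * pixel_size_mm
    return major, minor


def mean_diameter(region, pixel_size_mm=1.0):
    """Average of the major and minor axis lengths, as in the text."""
    major, minor = ellipse_axes(region, pixel_size_mm)
    return (major + minor) / 2
```

In this sketch a follicle would be seeded inside a dark (hypoechoic) area, grown with `grow_region`, and measured with `mean_diameter`; counting the regions whose mean diameter exceeds 5 mm would then give the follicle count used for comparison.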


Statistical analysis

Manual 2D US-based assessment performed by a clinician was considered the ground truth for all comparisons. The distribution of the data was first analyzed using the Shapiro–Wilk test. Depending on the distribution, Student's t-test or Wilcoxon's signed-rank test was used to compare the total number of follicles detected in each US sweep by the manual assessment and by the automated solution. In the manual method, only follicles with a mean diameter of 5 mm or more were recorded, whereas the automated solution could also detect follicles smaller than 5 mm in diameter; therefore, for comparison of counts, only those follicles estimated by the algorithm to be larger than 5 mm in diameter were considered. The correlation between the two methods was determined using Pearson's correlation coefficient or Spearman's rank correlation coefficient. The mean follicular diameters determined by the two methods were compared in the same way.
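For readers unfamiliar with rank correlation, the following minimal sketch shows how Spearman's coefficient can be computed: both samples are converted to ranks (ties share the average rank) and Pearson's coefficient is taken on the ranks. It is an illustration in plain Python, not the R or MATLAB code used in the study, and the sample counts at the end are hypothetical values, not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sweep follicle counts (manual vs. algorithm), for illustration only.
manual_counts = [6, 4, 9, 7, 3, 8]
algorithm_counts = [5, 4, 10, 6, 3, 7]
rho = spearman(manual_counts, algorithm_counts)
```

In practice, library routines such as `scipy.stats.shapiro`, `scipy.stats.wilcoxon`, and `scipy.stats.spearmanr` cover the normality test, the paired comparison, and the correlation, and also return P values.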

The limits of agreement between the two methods for follicle counts and diameters were assessed by the Bland–Altman method. The Bland–Altman method is considered a gold standard for method comparison studies and has been used extensively to compare different methods of follicular assessment.[8] The algorithm results were also compared with the assessments of the two independent experts using the same methodology. For all comparisons, P < 0.05 was considered to denote a statistically significant difference. All statistical analyses were performed using R (R Core Team, 2020; R Foundation for Statistical Computing, Vienna, Austria) and MATLAB®.
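The Bland–Altman limits of agreement can be sketched in a few lines. The sketch assumes the standard definition, mean difference ± 1.96 standard deviations of the paired differences, and takes differences as algorithm minus manual so that a negative bias means the algorithm reads lower; the paper does not spell out either convention, and the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(algorithm, manual, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are algorithm - manual; limits are bias -/+ z * SD of the
    differences, with SD computed as the sample standard deviation (n - 1).
    """
    diffs = [a - m for a, m in zip(algorithm, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - z * sd, bias + z * sd
```

Roughly 95% of paired differences are expected to fall between the two limits if the differences are approximately normally distributed, which is why the width of the interval, rather than the correlation coefficient, indicates whether two methods are interchangeable.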

Results

In total, 60 subjects were recruited in phase one. Of these, the data of six subjects were rejected because of bad image quality (incomplete sweeps of the ovaries or poor image quality, i.e., no follicle was properly visible). The data from the remaining 54 subjects were used for algorithm development. Of these, 29 participants were undergoing intrauterine insemination (IUI) and 25 were undergoing in vitro fertilization/fecundation with or without intracytoplasmic sperm injection (IVF/ICSI).

A total of ten subjects were enrolled in the second phase. Five participants from this group were receiving IUI treatment, whereas the remaining five were undergoing IVF/ICSI treatment. In phase two, a total of 86 2D US scans were performed on participants for follicular monitoring over the course of their menstrual cycles. A total of 251 US sweep recordings were obtained from these assessments; of these, 17 recordings were discarded because of bad image quality, and the remaining 234 recordings were used for the final analysis. The clinicians recorded a total of 1431 follicles in these manual scans, with mean follicular diameters ranging from 5 to 20 mm.

Comparison between algorithm and manual two-dimensional ultrasound-based assessment

Manual assessment of each ovary was performed by a clinician in real time using a 2D US scan; only follicles larger than 5 mm were considered. Each ovary was recorded to have an average of 6.35 (±3.21) follicles (median = 6; interquartile range = 4–9). The average number of follicles (larger than 5 mm) detected by the algorithm was 6.33 (±3.80) (median = 6; interquartile range = 4–8). The mean follicular diameter by the manual method was 10.74 (±3.64) mm (median = 10.5 mm; interquartile range = 8–13 mm). The mean follicular diameter estimated by the algorithm was 9.01 (±3.44) mm (median = 8.31 mm; interquartile range = 6.42–11.11 mm). Neither follicle counts nor diameters were normally distributed.

No statistically significant difference was observed in total follicle count between the algorithm and manual 2D assessment by Wilcoxon's signed-rank test. The two methods had an excellent correlation, with Spearman's rank correlation coefficient of 0.787. The 95% limits of agreement between the two methods were 4.232 for the upper limit and −4.258 for the lower limit.

The two methods showed a statistically significant difference in the measurement of mean follicular diameter, with the algorithm underestimating the mean diameter by an average of 1.725 mm (mean difference = −1.725 mm; ±2.16 mm). However, the two methods had an excellent correlation for mean follicular diameter measurement (Spearman's coefficient = 0.817). The Bland–Altman plots for the limits of agreement with 95% confidence intervals for the two methods are presented in [Figure 2]. The upper limit of agreement between the two methods was 2.508 mm, whereas the lower limit of agreement was −5.960 mm. All comparison-related results are summarized in [Table 1] and [Table 2].
Figure 2: Bland–Altman plot of the limits of agreement between automated solution and two-dimensional ultrasound-based manual assessment for measurement of follicle diameter

Table 1: Comparison of algorithm's result for total follicular count with the other methods

Table 2: Comparison of mean follicular diameter estimated by algorithm with the other methods
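As a quick consistency check on the agreement figures reported in this section, the limits can be related to the bias and spread by simple arithmetic. The sketch assumes that the limits were computed as mean difference ± 1.96 SD and that the "±2.16 mm" quoted for the diameter is the SD of the paired differences; neither is stated explicitly in the text, and the helper names are hypothetical.

```python
def limits_from_bias_sd(bias, sd, z=1.96):
    """Limits of agreement implied by a bias and SD of the differences."""
    return bias - z * sd, bias + z * sd


def bias_sd_from_limits(lower, upper, z=1.96):
    """Bias and SD of the differences implied by a pair of limits."""
    return (lower + upper) / 2, (upper - lower) / (2 * z)


# Diameter: reported mean difference -1.725 mm and spread 2.16 mm.
lower_d, upper_d = limits_from_bias_sd(-1.725, 2.16)

# Count: reported limits -4.258 and 4.232 follicles.
bias_c, sd_c = bias_sd_from_limits(-4.258, 4.232)
```

Under these assumptions the diameter limits come out at about −5.959 and 2.509 mm, which agrees with the reported −5.960 and 2.508 mm to within rounding of the SD. For the count, the reported limits imply a bias of about −0.013 follicles and an SD of about 2.17 follicles.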


Comparison between the algorithm and two independent experts

The two independent experts performed follicular assessments on the recorded 2D US sweeps. No statistically significant difference was observed in total follicle count between the algorithm and expert-1's assessment, and the correlation was excellent (coefficient = 0.784). The algorithm also had an excellent correlation with expert-2 (0.765), but there was a statistically significant difference (P = 0.00014) in total follicular count, with expert-2 detecting on average 0.6 more follicles than the algorithm. In fact, expert-2 detected significantly more follicles than both the manual 2D assessment (P < 0.001) and expert-1 (P < 0.001). The limits of agreement of the algorithm with the two experts for follicular count are summarized in [Table 1].

The algorithm had an excellent correlation with both experts for the mean follicular diameter measurement [Table 2]. However, there was a statistically significant difference in mean diameter measurement, with the algorithm tending to underestimate the mean diameter in comparison with the two experts; the difference was more prominent for expert-2, with a mean difference of −2.92 mm (±2.22). Expert-2's mean diameter measurements also differed significantly from those of the 2D manual assessment and expert-1, with expert-2 having a general tendency to overestimate the mean follicular diameter.

Discussion

Infertility is a significant problem worldwide, and factors such as delayed conception, pollution, and environmental and lifestyle changes are likely to aggravate it further. Modern ART relies heavily on ultrasound-based monitoring for infertility treatment. The 2D US-based manual assessment is the preferred method for follicular monitoring worldwide, although it is known to have high intra- and interobserver variability. We have developed a novel software solution to make conventional 2D US-based follicular assessment objective and fast. The purpose of this study was to present the validation results of our solution on a blind data set. We observed that it was feasible to use our software solution for automatic assessment of follicle count and measurement with accuracy comparable to the real-time 2D US-based manual assessment.

For the total follicle count, an excellent correlation was observed between our software solution and the 2D US-based manual assessment. Although the difference was not statistically significant, the algorithm had a tendency to underestimate the total follicle count (−0.012) in comparison with the 2D manual assessment. The same trend has been observed with 3D US-based automated solutions as well.[10],[11],[12] The limits of agreement observed with our algorithm are within the intra- and interobserver limits of agreement reported in the literature for total follicle count by the manual 2D US-based method.[10],[13] This indicates that our software solution is a reliable alternative to the conventional 2D method with better accuracy.

Our algorithm was also found to have an excellent correlation with the 2D manual assessment for measurement of follicular diameter. We further observed that the limits of agreement between our algorithm and the manual method are within the interobserver limits of agreement reported in the literature for manual measurement of mean follicular diameter.[14] However, the algorithm had a tendency to underestimate mean follicular diameter in comparison with the 2D manual assessment. We postulate that two principal groups of factors could be contributing to this: the first group is related to how the algorithm works, whereas the second is related to the way in which follicular diameter is measured in conventional practice.

For follicular detection, we used a seed-based region-growing image segmentation algorithm. This algorithm detects a follicle in an iterative process, starting from a small hypoechogenic region as a seed and then growing its border outward. To prevent overestimation of a follicle's size, the iterative process is restricted to within the follicle's border; this may lead to an underestimation of the follicle's diameter. A second algorithmic factor is related to the heterogeneous appearance of follicles: some follicles may contain echoic regions within their boundaries, and these regions cannot be detected automatically by the algorithm. A third factor is related to the limitation of measuring all follicles in a single ovarian sweep, in which each follicle might not be visible in the right plane, i.e., the plane in which it presents its largest mean diameter.

The other important factor for underestimation of follicle size is related to the way follicular diameter is measured in conventional practice. Measurement of mean follicular diameter using the conventional 2D US-based method has been associated with high intra- and interobserver variability because of the lack of consensus on standard protocols.[5] The placement of measurement calipers on the US image is also an important source of subjectivity in the assessment of follicular diameter. We observed that, when measuring diameter, clinicians tend to place the calipers slightly outside the follicular borders; this is mostly done as a safety margin so as not to miss any part of the follicle. This might lead to a systematic overestimation of follicle size by the manual method. 3D US-based automated software solutions have also been reported to underestimate mean follicular diameter in comparison with manual 2D US-based methods.[14],[15] This supports our hypothesis and demonstrates the reliability of our solution for measurement of mean follicular diameter.

Apart from the reduction in intra- and interobserver variability, another important advantage of automated software solutions is a significant reduction in the time required for follicular assessment.[10],[11],[12],[14],[16] In the present study, we did not measure the time required for the manual 2D US-based assessment. It has been reported in the literature that the mean time required for such assessment ranges from 56.8 s to 9.6 min, with a median of 314.4 s.[8] For our solution, the cineloop time (recording time) for each sweep was fixed at 10 s. The time taken by our algorithm for automatic assessment of follicular count and measurement in a US sweep was in the range of 30 to 60 s, depending on the number of follicles. Considering this, we believe that our software solution can bring a significant time saving for follicular assessment. In future work, we would like to confirm these initial results and examine the potential time gain in an integrated system (ultrasound device with the automation software solution).

A small sample size is an important limitation of our study. We tried to compensate for the small sample size by obtaining multiple US sweeps from each participant, which provided a total of 234 sweeps with more than 1431 follicles of different sizes. The other limitation concerns follicle size; in the present study, we tested our algorithm only on follicles larger than 5 mm in diameter. We are also exploring the possibility of incorporating postprocessing options to allow clinicians to manually add missed follicles and correct measurements.

Conclusion

This study validates the reliability and performance of our automated solution for follicle count and measurement using 2D US sweeps. We observed that our solution's performance is better than the known intra- and interobserver variability of the manual 2D US-based assessment. We believe that this solution could be very helpful in reducing measurement variability during follicular assessment and could make conventional 2D US-based monitoring more objective and much faster. We recommend further validation of this solution with well-designed multicenter studies.


Acknowledgments

We would like to acknowledge Chandan Mishra and Devdatt Kawathekar from Philips Ultrasound Business for their support in the project. We would also like to thank Dr. Ambarisha Bhandiwad and the staff of JSS hospital, Mysore, as well as the staff of Gunasheela Hospital and the researchers from Philips Research who were involved in the early phases of the project.

Financial support and sponsorship

None.

Conflicts of interest

There are no conflicts of interest.

References

1. Sun H, Gong TT, Jiang YT, Zhang S, Zhao YH, Wu QJ. Global, regional, and national prevalence and disability-adjusted life-years for infertility in 195 countries and territories, 1990-2017: Results from a global burden of disease study, 2017. Aging (Albany NY) 2019;11:10952-91.
2. Mascarenhas MN, Flaxman SR, Boerma T, Vanderpoel S, Stevens GA. National, regional, and global trends in infertility prevalence since 1990: A systematic analysis of 277 health surveys. PLoS Med 2012;9:e1001356.
3. Ata B, Seyhan A, Reinblatt SL, Shalom-Paz E, Krishnamurthy S, Tan SL. Comparison of automated and manual follicle monitoring in an unrestricted population of 100 women undergoing controlled ovarian stimulation for IVF. Hum Reprod 2011;26:127-33.
4. Raine-Fenning N, Jayaprakasan K, Clewes J, Joergner I, Bonaki SD, Chamberlain S, et al. SonoAVC: A novel method of automatic volume calculation. Ultrasound Obstet Gynecol 2008;31:691-6.
5. Duijkers IJ, Louwé LA, Braat DD, Klipping C. One, two or three: How many directions are useful in transvaginal ultrasound measurement of ovarian follicles? Eur J Obstet Gynecol Reprod Biol 2004;117:60-3.
6. Forman RG, Robinson J, Yudkin P, Egan D, Reynolds K, Barlow DH. What is the true follicular diameter: An assessment of the reproducibility of transvaginal ultrasound monitoring in stimulated cycles. Fertil Steril 1991;56:989-92.
7. Murtinger M, Zech MH, Dietmar S, Zech NH. Outpatient follicle monitoring: A plea for standardization in ultrasound based follicle monitoring and data transfer. J Reprod Infertil 2014;15:105-8.
8. Vandekerckhove F, Bracke V, De Sutter P. The value of automated follicle volume measurements in IVF/ICSI. Front Surg 2014;1:18.
9. Peres Fagundes PA, Chapon R, Olsen PR, Schuster AK, Mattia MM, Cunha-Filho JS. Evaluation of three-dimensional SonoAVC ultrasound for antral follicle count in infertile women: Its agreement with conventional two-dimensional ultrasound and serum levels of anti-Müllerian hormone. Reprod Biol Endocrinol 2017;15:96.
10. Deb S, Jayaprakasan K, Campbell BK, Clewes JS, Johnson IR, Raine-Fenning NJ. Intraobserver and interobserver reliability of automated antral follicle counts made using three-dimensional ultrasound and SonoAVC. Ultrasound Obstet Gynecol 2009;33:477-83.
11. Raine-Fenning N, Jayaprakasan K, Deb S, Clewes J, Joergner I, Dehghani Bonaki S, et al. Automated follicle tracking improves measurement reliability in patients undergoing ovarian stimulation. Reprod Biomed Online 2009;18:658-63.
12. Deb S, Campbell BK, Clewes JS, Raine-Fenning NJ. Quantitative analysis of antral follicle number and size: A comparison of two-dimensional and automated three-dimensional ultrasound techniques. Ultrasound Obstet Gynecol 2010;35:354-60.
13. Scheffer GJ, Broekmans FJ, Bancsi LF, Habbema JD, Looman CW, Te Velde ER. Quantitative transvaginal two- and three-dimensional sonography of the ovaries: Reproducibility of antral follicle counts. Ultrasound Obstet Gynecol 2002;20:270-5.
14. Raine-Fenning N, Jayaprakasan K, Chamberlain S, Devlin L, Priddle H, Johnson I. Automated measurements of follicle diameter: A chance to standardize? Fertil Steril 2009;91:1469-72.
15. Pan P, Chen X, Li Y, Zhang Q, Zhao X, Bodombossou-Djobo MM, et al. Comparison of manual and automated measurements of monodominant follicle diameter with different follicle size in infertile patients. PLoS One 2013;8:e77095.
16. Deutch TD, Joergner I, Matson DO, Oehninger S, Bocca S, Hoenigmann D, et al. Automated assessment of ovarian follicles using a novel three-dimensional ultrasound software. Fertil Steril 2009;92:1562-8.

Correspondence Address:
Celine Firtion,
Philips Research India, Philips Innovation Campus, Manyata Tech-Park, Bengaluru, Karnataka

Source of Support: None, Conflict of Interest: None

