diff --git a/docs/tutorials/Clustering.ipynb b/docs/tutorials/Clustering.ipynb
index 6863cf42..238b3855 100644
--- a/docs/tutorials/Clustering.ipynb
+++ b/docs/tutorials/Clustering.ipynb
@@ -286,7 +286,7 @@
  "## Understanding key parameters\n",
  "\n",
  "- Determining an appropriate threshold for cutoff\n",
- " - Butina uses distances (which is 1 - distance) and the cutoff is dependent on the distance metric used. As mentioned earlier, Datamol uses Tanimoto with ECFP fingerprint. Therefore the distance cutoff is 1 - Tanimoto.\n",
+ " - Butina uses distances (which is 1 - similarity) and the cutoff is dependent on the distance metric used. As mentioned earlier, Datamol uses Tanimoto with ECFP fingerprint. Therefore the distance cutoff is 1 - Tanimoto.\n",
  " - Generally speaking, if you have a very small distance cutoff, compounds must be extremely similar (i.e. high Tanimoto score) in order to be grouped into one cluster. Therefore, with a small distance cutoff, you’ll get more clusters with fewer compounds per cluster. Vice versa is true.\n",
  "\n",
  "**Note:** This is an extremely general overview, in reality, the output greatly depends on both the size and diversity of the dataset being used. There is no “default” cutoff that is set in Datamol and instead, each user should set cutoffs according to their specific dataset and use case. \n",
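
The corrected wording (distance = 1 - similarity) and the effect of the cutoff on cluster count can be illustrated with a minimal sketch. It is written in plain Python and is not Datamol's or RDKit's implementation: `tanimoto`, `butina_like`, and the toy fingerprints (sets of "on" bit indices) are hypothetical names and data chosen for illustration, and treating a distance equal to the cutoff as "within the cutoff" is an assumption of this sketch.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def butina_like(fps, cutoff):
    """Simplified greedy, Butina-style clustering on a distance cutoff.

    Two items are neighbours when their distance, 1 - Tanimoto,
    is at most `cutoff` (assumption: the comparison is inclusive).
    """
    n = len(fps)
    # Every item is its own neighbour, so singletons form their own cluster.
    neighbours = [{i} for i in range(n)]
    for i, j in combinations(range(n), 2):
        distance = 1.0 - tanimoto(fps[i], fps[j])
        if distance <= cutoff:
            neighbours[i].add(j)
            neighbours[j].add(i)

    # Visit candidates with the most neighbours first; each becomes a
    # centroid and claims its neighbours that are still unassigned.
    order = sorted(range(n), key=lambda i: len(neighbours[i]), reverse=True)
    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue
        members = sorted(neighbours[i] - assigned)
        clusters.append(members)
        assigned.update(members)
    return clusters


# Toy fingerprints (illustrative only, not real ECFP output).
fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 6},
    {7, 8, 9},
    {7, 8, 9, 10},
]

for cutoff in (0.1, 0.3, 0.6):
    clusters = butina_like(fps, cutoff)
    print(f"cutoff={cutoff}: {len(clusters)} clusters -> {clusters}")
```

On this toy data the number of clusters falls from 5 to 3 to 2 as the cutoff grows from 0.1 to 0.6, which matches the trend described in the hunk: a small distance cutoff yields more clusters with fewer compounds each.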