<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE article PUBLIC "-//NLM/DTD JATS (Z39.96) Journal Publishing DTD v1.2 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
    <!--<?xml-stylesheet type="text/xsl" href="article.xsl">-->
<article xmlns:ns0="http://www.w3.org/1999/xlink" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.2" xml:lang="en">
	<front>
		<journal-meta>
			<journal-id journal-id-type="issn">2303-9868</journal-id>
			<journal-id journal-id-type="eissn">2227-6017</journal-id>
			<journal-title-group>
				<journal-title>International Research Journal</journal-title>
			</journal-title-group>
			<issn pub-type="epub">2303-9868</issn>
			<publisher>
				<publisher-name>Cifra LLC</publisher-name>
			</publisher>
		</journal-meta>
		<article-meta>
			<article-id pub-id-type="doi">10.60797/IRJ.2026.165.74</article-id>
			<article-categories>
				<subj-group>
					<subject>Brief communication</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>APPLYING THE K-SHAPE CLUSTERING ALGORITHM TO SPORTS TACTICS ANALYSIS: A USE-CASE IN FINSWIMMING</article-title>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author" corresp="yes">
					<name>
						<surname>Bogdashkin</surname>
						<given-names>Aleksandr Yegorovich</given-names>
					</name>
					<email>swim.alex@mail.ru</email>
					<xref ref-type="aff" rid="aff-1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Saurov</surname>
						<given-names>Yevgenii Alekseevich</given-names>
					</name>
					<email>saurov.ev@gmail.com</email>
					<xref ref-type="aff" rid="aff-1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Morozov</surname>
						<given-names>Sergei Nikolaevich</given-names>
					</name>
					<email>morozov750@mail.ru</email>
					<xref ref-type="aff" rid="aff-1">1</xref>
				</contrib>
			</contrib-group>
			<aff id="aff-1">
				<label>1</label>
				<institution>Russian University of Sports (SCOLIPE)</institution>
			</aff>
			<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2026-03-17">
				<day>17</day>
				<month>03</month>
				<year>2026</year>
			</pub-date>
			<pub-date pub-type="collection">
				<year>2026</year>
			</pub-date>
			<volume>4</volume>
			<issue>165</issue>
			<fpage>1</fpage>
			<lpage>4</lpage>
			<history>
				<date date-type="received" iso-8601-date="2023-12-29">
					<day>29</day>
					<month>12</month>
					<year>2023</year>
				</date>
				<date date-type="accepted" iso-8601-date="2024-01-12">
					<day>12</day>
					<month>01</month>
					<year>2024</year>
				</date>
			</history>
			<permissions>
				<copyright-statement>Copyright: &#x00A9; 2022 The Author(s)</copyright-statement>
				<copyright-year>2022</copyright-year>
				<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
					<license-p>
						This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See
						<uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.
					</license-p>
				</license>
			</permissions>
			<self-uri xlink:href="https://research-journal.org/archive/3-165-2026-march/10.60797/IRJ.2026.165.74"/>
			<abstract>
				<p>This article demonstrates the effectiveness of the k-Shape clustering algorithm for tactics analysis in endurance sports such as swimming, skiing, and cycling. To popularize finswimming, the study is conducted on one of the events of this sport. We identified the most and the least efficient tactical schemes for the 400-meter distances in finswimming for men and women. To do so, we applied k-Shape, a modern and effective clustering algorithm. Cross-correlation is used as the distance measure between event splits organized as time series. Our study shows that the most productive tactics in the 400-meter finswimming disciplines follow an even pacing strategy with a fast 1st lap, a gradual slowdown from the 2nd to the 3rd lap, constant speed from the 4th to the 7th lap, and a speed-up on the 8th lap.</p>
			</abstract>
			<kwd-group>
				<kwd>finswimming</kwd>
				<kwd>k-shape</kwd>
				<kwd>clustering</kwd>
				<kwd>sport results</kwd>
				<kwd>tactics</kwd>
				<kwd>competitive activity</kwd>
			</kwd-group>
		</article-meta>
	</front>
	<body>
		<sec>
			<p>1. Introduction</p>
			<p>Tactics are an essential part of the training process and of competition at any level. Different groups of sports have different tactical characteristics [1], [4]. Endurance sports involve time-trial [9] competitions in which athletes repeat similar actions throughout a race. Competition rules restrict the variety of technical and tactical actions, which may give the false impression that integrated preparation for competitions is unnecessary. Unfortunately, the sports research community has not investigated this topic sufficiently.</p>
			<p>Tactics in endurance sports (skating, running, finswimming, swimming, cycling and others) are mostly treated as pacing [11]. Athletes aim to choose the most suitable race pace based on their individual skills and theoretical knowledge. Tactics analysis in this group of sports can be viewed as time series analysis [8]: athletes repeatedly complete standardized laps during a race, which makes it convenient to find similar patterns in the data.</p>
			<p>Tactics in cyclic sports are most appropriately analyzed with methods of applied statistics. The discriminant method is the most popular in research of this kind [7]. It assigns data to classes defined in advance. The method has several disadvantages: the results rely heavily on the researcher’s judgment and on the rules used to create the classes, it is very time-consuming, and it does not scale [6].</p>
			<p>Research on swimming tactics is also usually carried out with the discriminant method. For example, a group of British scientists [10] studied tactics in the 400-meter freestyle by comparing observed tactical schemes with six standard options described in another study [1]: parabolic, fast-start-even, positive, negative, even, and variable. The comparison showed that elite swimmers used only three of the six schemes: parabolic, fast-start-even, and positive. The athletes preferred schemes with a quick start and a parabolic change in speed, but none of the tactical schemes had a significant impact on the final result [10]. The disadvantages of this study are the same: it is highly subjective, time-consuming, and carries a high risk of producing incorrect data.</p>
			<p>Clustering is a more appropriate and rigorous way of analyzing time series. We found no studies of this kind in finswimming.</p>
			<p>2. Research methods and principles</p>
			<p>2.1. Clustering</p>
			<p>Clustering aims to find homogeneous groups in unlabeled data [3]. Over the past decades, many approaches to measuring similarity have been introduced, among them shape-based, feature-based, and model-based approaches [8]. In the shape-based approach, the shapes of two time series are matched as closely as possible by non-linear stretching, e.g. using the dynamic time warping (DTW) distance. In the feature-based approach, raw data are transformed into a lower-dimensional representation, for example by value-based principal component analysis [13] or singular value decomposition [6]. In model-based approaches, raw data are used to train a model, and conventional clustering algorithms are then applied to the residuals of the fitted model [14].</p>
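			<p>To illustrate the non-linear stretching used by shape-based approaches, the following is a minimal sketch of the classic DTW distance in plain Python. It is not the method used in this study; the function name dtw_distance and the sample series are assumptions made for illustration.</p>
			<code language="python">
```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical series: the second one repeats its first value, then follows
# the same shape, so stretching aligns them without any cost.
series_a = [0, 1, 2, 1, 0]
series_b = [0, 0, 1, 2, 1, 0]
print(dtw_distance(series_a, series_b))
```
			</code>
			<p>Because one point may be matched to several points of the other series, DTW tolerates local stretching in time, at the price of a quadratic number of comparisons per pair of series.</p>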
			<p>In this study, we use one of the shape-based clustering algorithms, namely k-Shape [12]. Compared with other approaches, this algorithm offers a scalable and interpretable solution to the time series clustering problem. k-Shape has a time complexity of O(max(nm<sup>2</sup>, m<sup>3</sup>)), where n is the number of time series and m is their length, but achieves significantly better accuracy than k-Means [8].</p>
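			<p>k-Shape compares time series with a distance derived from normalized cross-correlation. The sketch below shows one way to compute such a shape-based distance in plain Python, assuming series of equal length that are z-normalized first. The function names and the lap times are hypothetical, and the brute-force loop over shifts is a simplification rather than the tslearn implementation.</p>
			<code language="python">
```python
import math


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    if std == 0:
        return [0.0 for _ in series]  # a flat series carries no shape
    return [(v - mean) / std for v in series]


def cross_correlation(x, y, shift):
    """Dot product of x and y after sliding x by `shift` positions."""
    m = len(x)
    if shift >= 0:
        return sum(x[i + shift] * y[i] for i in range(m - shift))
    return sum(x[i] * y[i - shift] for i in range(m + shift))


def shape_based_distance(x, y):
    """1 minus the maximum normalized cross-correlation over all shifts."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    x, y = z_normalize(x), z_normalize(y)
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0:
        return 1.0  # assumption: treat flat series as uncorrelated
    m = len(x)
    best = max(cross_correlation(x, y, s) for s in range(-(m - 1), m))
    return 1.0 - best / norm


# Hypothetical 50 m lap times in seconds for two 400 m races.
race_a = [22.1, 23.4, 23.9, 24.0, 24.1, 24.0, 24.2, 23.5]
race_b = [24.3, 25.7, 26.3, 26.4, 26.5, 26.4, 26.6, 25.8]  # slower, similar shape
print(round(shape_based_distance(race_a, race_b), 3))
```
			</code>
			<p>The distance is 0 for series of identical shape and grows toward 2 as the shapes diverge. Because of the z-normalization, a uniformly slower race with the same pacing profile stays close to the faster one.</p>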
			<p>2.2. Frameworks</p>
			<p>All software for this study was developed in Python 3.7.8. For data preprocessing and scaling (MinMaxScaler), pandas 1.4.2, numpy 1.21.0 and scikit-learn 1.1 were used. For clustering, the built-in k-Shape implementation of tslearn 0.5.2 was used. All experiments were run on a laptop with an Intel Core i7 CPU @ 2.8 GHz and 16 GB of RAM.</p>
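			<p>Preprocessing of this kind can be sketched without the libraries listed above. The example below converts cumulative split times into lap times and rescales them to the range from 0 to 1. It assumes that each performance is scaled on its own, whereas scikit-learn's MinMaxScaler scales each feature column by default; the function names and the split times are hypothetical.</p>
			<code language="python">
```python
def lap_times_from_splits(cumulative_splits):
    """Convert cumulative split times into per-lap times."""
    previous = [0.0] + list(cumulative_splits[:-1])
    return [current - before for current, before in zip(cumulative_splits, previous)]


def min_max_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        return [0.0 for _ in values]  # constant input: no spread to rescale
    return [(v - lowest) / span for v in values]


# Hypothetical cumulative splits (seconds) at every 50 m of a 400 m race.
splits = [22.0, 45.5, 69.5, 93.5, 117.5, 141.5, 165.5, 189.0]
laps = lap_times_from_splits(splits)
scaled = min_max_scale(laps)
print(laps)
print(scaled)
```
			</code>
			<p>After scaling, the fastest lap of a race maps to 0 and the slowest to 1, so races swum at different absolute speeds can be compared by the shape of their pacing profile.</p>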
			<p>3. Main results</p>
			<p>In this research, we analyzed 594 results in the 400-meter surface, bifins, and immersion disciplines for men and women at top world competitions from 2010 to 2021. Times for the 50-meter laps were calculated for every performance. We performed k-Shape clustering for k = 2, 3, 4, and 5 and found that k = 2 and k = 3 gave the most distinct shapes for the different disciplines. Each performance was assigned a number of points using the ranking system developed earlier [2]. All performances were divided into four groups by the three quartiles of the points distribution. The productivity of every tactical scheme was evaluated as the percentage of its results that lie above the 0.75 quantile (Table 1).</p>
			<table-wrap id="T1">
				<label>Table 1</label>
				<caption>
					<p>Frequency of use of different tactical schemes in the 400-meter finswimming disciplines</p>
				</caption>
				<table>
					<tr>
						<td rowspan="3">No.</td>
						<td colspan="8">Number of points</td>
						<td colspan="2" rowspan="2">Overall</td>
					</tr>
					<tr>
						<td colspan="2">Above 0.75 quantile</td>
						<td colspan="2">0.75-0.5 quantile</td>
						<td colspan="2">0.5-0.25 quantile</td>
						<td colspan="2">Below 0.25 quantile</td>
					</tr>
					<tr>
						<td>n</td>
						<td>%</td>
						<td>n</td>
						<td>%</td>
						<td>n</td>
						<td>%</td>
						<td>n</td>
						<td>%</td>
						<td>n</td>
						<td>%</td>
					</tr>
					<tr>
						<td colspan="11">400 meters surface men</td>
					</tr>
					<tr>
						<td>1</td>
						<td>27</td>
						<td>32.93</td>
						<td>24</td>
						<td>29.27</td>
						<td>19</td>
						<td>23.17</td>
						<td>12</td>
						<td>14.63</td>
						<td>82</td>
						<td>57.75</td>
					</tr>
					<tr>
						<td>2</td>
						<td>8</td>
						<td>14.81</td>
						<td>9</td>
						<td>16.67</td>
						<td>14</td>
						<td>25.93</td>
						<td>23</td>
						<td>42.59</td>
						<td>54</td>
						<td>38.03</td>
					</tr>
					<tr>
						<td>3</td>
						<td>0</td>
						<td>0</td>
						<td>3</td>
						<td>50</td>
						<td>2</td>
						<td>33.33</td>
						<td>1</td>
						<td>16.67</td>
						<td>6</td>
						<td>4.23</td>
					</tr>
					<tr>
						<td colspan="11">400 meters surface women</td>
					</tr>
					<tr>
						<td>1</td>
						<td>30</td>
						<td>25</td>
						<td>29</td>
						<td>24.17</td>
						<td>31</td>
						<td>25.83</td>
						<td>30</td>
						<td>25</td>
						<td>120</td>
						<td>88.24</td>
					</tr>
					<tr>
						<td>2</td>
						<td>4</td>
						<td>25</td>
						<td>5</td>
						<td>31.25</td>
						<td>3</td>
						<td>18.75</td>
						<td>4</td>
						<td>25</td>
						<td>16</td>
						<td>11.76</td>
					</tr>
					<tr>
						<td colspan="11">400 meters bifins men</td>
					</tr>
					<tr>
						<td>1</td>
						<td>11</td>
						<td>27.5</td>
						<td>10</td>
						<td>25</td>
						<td>10</td>
						<td>25</td>
						<td>9</td>
						<td>22.5</td>
						<td>40</td>
						<td>64.52</td>
					</tr>
					<tr>
						<td>2</td>
						<td>4</td>
						<td>18.18</td>
						<td>6</td>
						<td>27.27</td>
						<td>5</td>
						<td>22.73</td>
						<td>5</td>
						<td>22.73</td>
						<td>22</td>
						<td>35.48</td>
					</tr>
					<tr>
						<td colspan="11">400 meters bifins women</td>
					</tr>
					<tr>
						<td>1</td>
						<td>12</td>
						<td>27.27</td>
						<td>10</td>
						<td>22.73</td>
						<td>11</td>
						<td>25</td>
						<td>11</td>
						<td>25</td>
						<td>44</td>
						<td>73.33</td>
					</tr>
					<tr>
						<td>2</td>
						<td>3</td>
						<td>18.75</td>
						<td>5</td>
						<td>31.25</td>
						<td>4</td>
						<td>25</td>
						<td>4</td>
						<td>25</td>
						<td>16</td>
						<td>26.67</td>
					</tr>
					<tr>
						<td colspan="11">400 meters immersion men</td>
					</tr>
					<tr>
						<td>1</td>
						<td>20</td>
						<td>27.4</td>
						<td>21</td>
						<td>28.77</td>
						<td>16</td>
						<td>21.92</td>
						<td>16</td>
						<td>21.92</td>
						<td>73</td>
						<td>72.28</td>
					</tr>
					<tr>
						<td>2</td>
						<td>6</td>
						<td>21.43</td>
						<td>4</td>
						<td>14.29</td>
						<td>9</td>
						<td>32.14</td>
						<td>9</td>
						<td>32.14</td>
						<td>28</td>
						<td>27.72</td>
					</tr>
					<tr>
						<td colspan="11">400 meters immersion women</td>
					</tr>
					<tr>
						<td>1</td>
						<td>9</td>
						<td>36</td>
						<td>3</td>
						<td>12</td>
						<td>4</td>
						<td>16</td>
						<td>9</td>
						<td>36</td>
						<td>25</td>
						<td>26.88</td>
					</tr>
					<tr>
						<td>2</td>
						<td>3</td>
						<td>17.65</td>
						<td>6</td>
						<td>35.29</td>
						<td>4</td>
						<td>23.53</td>
						<td>4</td>
						<td>23.53</td>
						<td>17</td>
						<td>18.28</td>
					</tr>
					<tr>
						<td>3</td>
						<td>12</td>
						<td>23.53</td>
						<td>14</td>
						<td>27.45</td>
						<td>15</td>
						<td>29.41</td>
						<td>10</td>
						<td>19.61</td>
						<td>51</td>
						<td>54.84</td>
					</tr>
				</table>
			</table-wrap>
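			<p>The productivity measure reported in Table 1 can be sketched as follows. The example assumes that the 0.75 quantile is computed over all performances of one discipline with Python's statistics.quantiles, whose default interpolation method may differ from the one used in the study. The function names and the sample points are hypothetical.</p>
			<code language="python">
```python
import statistics
from collections import defaultdict


def percentage(count, total):
    """Share of `count` in `total`, in percent, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def productivity_by_cluster(results):
    """Percentage of each cluster's results above the 0.75 quantile.

    `results` is a list of (cluster_label, points) pairs for one discipline.
    """
    all_points = [points for _, points in results]
    # With n=4, statistics.quantiles returns the three quartile cut points.
    upper_quartile = statistics.quantiles(all_points, n=4)[2]
    totals = defaultdict(int)
    top = defaultdict(int)
    for label, points in results:
        totals[label] += 1
        if points > upper_quartile:
            top[label] += 1
    return {label: percentage(top[label], totals[label]) for label in totals}


# Hypothetical ranking points for eight performances in two clusters.
sample = [(1, 8), (1, 7), (1, 5), (1, 3), (2, 6), (2, 4), (2, 2), (2, 1)]
print(productivity_by_cluster(sample))

# Check against Table 1: 27 of the 82 results of scheme 1 in the
# 400 meters surface men discipline lie above the 0.75 quantile.
print(percentage(27, 82))
```
			</code>
			<p>A scheme with a higher percentage places a larger share of its performances in the top quarter of the ranking, which is the sense in which a tactical scheme is called productive here.</p>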
			<p>The most productive tactical schemes follow a parabolic pacing strategy (Figure 1). All athletes swam the 1st lap faster than the others because of the acceleration from the start off the blocks, the large stores of phosphocreatine and glycogen in the muscles, and the high level of adrenaline in the blood caused by pre-competition stress. On the 2nd and 3rd laps the phosphocreatine level is very low and the glycogen level is decreasing, which leads to a gradual slowdown. From the 4th lap on, the less powerful aerobic pathways take a larger role in ATP resynthesis, which results in a slower but even speed. Over the last 50 meters athletes go &quot;all-in&quot; and try to spend the rest of their energy. The least productive schemes show either a constant slowdown from start to finish or a significant slowdown on the 7th lap followed by a speed-up on the 8th lap (Figure 2). For easier visual comparison, all results were scaled to the range from 0 to 1.</p>
			<fig id="F1">
				<label>Figure 1</label>
				<caption>
					<p>The most productive tactical schemes at the 400-meter distances for men and women</p>
				</caption>
				<alt-text>The most productive tactical schemes at the 400-meter distances for men and women</alt-text>
				<graphic ns0:href="/media/images/2026-03-16/3bbd62fc-65be-4b9f-8b10-1e10bebdd39c.png"/>
			</fig>
			<fig id="F2">
				<label>Figure 2</label>
				<caption>
					<p>The least productive tactical schemes at the 400-meter distances for men and women</p>
				</caption>
				<alt-text>The least productive tactical schemes at the 400-meter distances for men and women</alt-text>
				<graphic ns0:href="/media/images/2026-03-16/befe7cde-d712-4675-9014-96cb994fe2d5.png"/>
			</fig>
			<p>4. Discussion</p>
			<p>In this study, different pacing strategies were extracted by applying a state-of-the-art data-driven approach. Groups of similar shape were found with the k-Shape algorithm. As shown in the literature, this approach is used for natural signals in finance, medicine, biology and many other fields [13]. However, to the best of our knowledge, it has never been applied to sports tactics analysis. In our analysis, the new approach compared favorably with previously applied methods such as the discriminant method, which can be characterized as subjective and strongly affected by how the classes are initially defined. The data-driven k-Shape clustering algorithm overcomes these drawbacks. So far, the algorithm has been applied only to middle distances, and it can be extended to long distances. Unfortunately, the method is not suitable for sprint distances. Future research should consider extending clustering algorithms to short distances; this will be the objective of our future studies.</p>
			<p>5. Conclusion</p>
			<p>The k-Shape clustering algorithm proved to be an appropriate and modern method for analyzing tactical patterns in cyclic sports. The method avoids subjectivity and provides more objective and valid data than the alternatives considered. The most productive tactics in the 400-meter finswimming disciplines follow an even pacing strategy with a fast 1st lap, a gradual slowdown from the 2nd to the 3rd lap, constant speed from the 4th to the 7th lap, and a speed-up on the 8th lap.</p>
		</sec>
	</body>
	<back>
		<ack>
			<title>Acknowledgements</title>
			<p/>
		</ack>
		<sec>
			<title>Competing Interests</title>
			<p/>
		</sec>
		<ref-list>
			<ref id="B1">
				<label>1</label>
				<mixed-citation publication-type="confproc">Abbiss C. Describing and understanding pacing strategies during athletic competition / C. Abbiss, P. Laursen — 2008. — № 38. — p. 239–252.</mixed-citation>
			</ref>
			<ref id="B2">
				<label>2</label>
				<mixed-citation publication-type="confproc">Bogdashkin A. Development of a ranking system in finswimming as a pedagogical tool for the comparison of athletic performance / A. Bogdashkin, S. Morozov, E. Saurov — 2022. — № 119. — p. 27–30.</mixed-citation>
			</ref>
			<ref id="B3">
				<label>3</label>
				<mixed-citation publication-type="confproc">Chung F.L. Flexible time series pattern matching based on perceptually important points / F.L. Chung — 2001. — № 1. — p. 1–7.</mixed-citation>
			</ref>
			<ref id="B4">
				<label>4</label>
				<mixed-citation publication-type="confproc">De Koning J.J. Using modeling to understand how athletes in different disciplines solve the same problem: swimming versus running versus speed skating / J.J. De Koning, C. Foster, A. Lucia — 2011. — № 6. — p. 276–280.</mixed-citation>
			</ref>
			<ref id="B5">
				<label>5</label>
				<mixed-citation publication-type="confproc">Faloutsos C. Fast subsequence matching in time-series databases / C. Faloutsos, M. Ranganathan — 1994. — № 1. — p. 419–429.</mixed-citation>
			</ref>
			<ref id="B6">
				<label>6</label>
				<mixed-citation publication-type="confproc">Filippone M.F. A survey of kernel and spectral methods for clustering / M.F. Filippone, F. Camastra, F. Masulli — 2008. — № 41 (1). — p. 176–190.</mixed-citation>
			</ref>
			<ref id="B7">
				<label>7</label>
				<mixed-citation publication-type="confproc">Kaufman L. Finding groups in data: An introduction to cluster analysis / L. Kaufman, P.J. Rousseeuw — New York: John Wiley &amp; Sons, 2009. — 342 p.</mixed-citation>
			</ref>
			<ref id="B8">
				<label>8</label>
				<mixed-citation publication-type="confproc">Meesrikamolkul W. Shape-based clustering for time series data / W. Meesrikamolkul, V. Niennattrakul — 2012. — № 3. — p. 530–541.</mixed-citation>
			</ref>
			<ref id="B9">
				<label>9</label>
				<mixed-citation publication-type="confproc">Konings M.J. Pacing Decision Making in Sport and the Effects of Interpersonal Competition: A Critical Review / M.J. Konings, F.J. Hettinga — 2018. — № 48 (11). — p. 1829–1843.</mixed-citation>
			</ref>
			<ref id="B10">
				<label>10</label>
				<mixed-citation publication-type="confproc">Mauger A.R. Analysis of pacing strategy selection in elite 400-m freestyle swimming / A.R. Mauger, J. Neuloh, P.C. Castle — 2012. — № 44 (11). — p. 2205–2212.</mixed-citation>
			</ref>
			<ref id="B11">
				<label>11</label>
				<mixed-citation publication-type="confproc">McGibbon K.E. Pacing in swimming: A systematic review / K.E. McGibbon, D.P. Pyne, M.E. Shephard, K.G. Thompson — 2018. — № 48 (7). — p. 1621–1633.</mixed-citation>
			</ref>
			<ref id="B12">
				<label>12</label>
				<mixed-citation publication-type="confproc">Paparrizos J. K-shape: Efficient and accurate clustering of time series / J. Paparrizos, L. Gravano — 2015. — № 1. — p. 1855–1870.</mixed-citation>
			</ref>
			<ref id="B13">
				<label>13</label>
				<mixed-citation publication-type="confproc">Shumway R.H.R. Time-frequency clustering and discriminant analysis / R.H.R. Shumway — 2003. — № 3. — p. 307–314.</mixed-citation>
			</ref>
			<ref id="B14">
				<label>14</label>
				<mixed-citation publication-type="confproc">Li X. Linear time complexity time series clustering with symbolic pattern forest / X. Li — 2019. — № 4. — p. 190–205.</mixed-citation>
			</ref>
		</ref-list>
	</back>
	<fundings/>
</article>