The links deposited in the LinkProt database are divided into three categories: deterministic, probabilistic, and macromolecular links.

In each category, the links are divided into topological classes and by the number of components that constitute the link. Each topological class is then subdivided into more exact subclasses that also take into account the orientation and chirality of the link. The topological classification of each link type is described in this section. Currently the database collects information about links made of:

Among the complete protein structures deposited in the PDB (or those where the missing part of the protein chain does not change the topology type), we found 20 different types of links.

Deterministic links are formed by covalent loops closed by disulfide bridges. Such links can be intra- or inter-molecular (involving, respectively, one chain or more than one chain). The first one-chain links were identified in [1]. The only two-chain deterministic link was described in [2]. The LinkProt database automatically classifies both types of links. Currently, one- and two-chain deterministic links have been found.

The one- and two-chain deterministic links are classified into two main topological classes: Hopf and Solomon links. As the protein chain is naturally oriented from the N- to the C-terminus, the links formed by covalent loops are equipped with an inherited orientation. This orientation, in principle, splits the topological types into smaller, topologically distinct subgroups. In particular, in the case of one-chain links, one can distinguish three topological types: **positive Hopf link**, **negative Hopf link**, and **negative Solomon link** (Fig. 1). The Hopf link and the Solomon link are prime links found in the mathematical enumeration of links, while the composition (or connected sum) of the Solomon link and the Hopf link is not shown in the enumeration, since the tables list only prime knots and links.

On each covalent loop, a minimal surface is spanned (Fig. 2). This allows for the analysis of potentially biologically important amino acid residues that actually pierce the spanning surface. Moreover, the chain orientation determines the surface orientation, and therefore the piercing direction. As a result, the chain orientation also allows us to assign a sign to Hopf and Solomon links: the signs are chosen to match the sign of the piercings (determined as in [3]) of the surface spanned on the covalent loop by the other oriented component. For example, in the case of the negative Hopf link, there is one piercing through each surface, and both piercings are negative.
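The idea of signed surface piercings can be illustrated with a small sketch. It is not the LinkProt implementation: instead of a minimal surface it spans a simple triangle fan from the centroid of each loop (an assumption made only to keep the example short), and it uses the right-hand rule for the sign, which may differ from the convention of [3]. The function names `piercing_sign` and `signed_piercings` and the two square loops are hypothetical.

```python
import numpy as np

def piercing_sign(p, q, v0, v1, v2, eps=1e-12):
    """Return +1 or -1 if segment p->q crosses triangle (v0, v1, v2), else 0."""
    d = q - p
    e1, e2 = v1 - v0, v2 - v0
    normal = np.cross(e1, e2)          # orientation inherited from the loop direction
    denom = np.dot(normal, d)
    if abs(denom) < eps:               # segment parallel to the triangle plane
        return 0
    t = np.dot(normal, v0 - p) / denom
    if not 0.0 <= t < 1.0:             # plane is not crossed within the segment
        return 0
    w = p + t * d - v0                 # crossing point relative to v0
    d00, d01, d11 = e1 @ e1, e1 @ e2, e2 @ e2
    d20, d21 = w @ e1, w @ e2
    den = d00 * d11 - d01 * d01
    u = (d11 * d20 - d01 * d21) / den  # barycentric coordinates of the crossing
    v = (d00 * d21 - d01 * d20) / den
    if u < 0 or v < 0 or u + v > 1:    # crossing point lies outside the triangle
        return 0
    return 1 if denom > 0 else -1

def signed_piercings(loop, other):
    """Sum the signed piercings of closed polygon `other` through a fan
    surface spanned on closed polygon `loop` (fan surface is an assumption)."""
    loop = np.asarray(loop, dtype=float)
    other = np.asarray(other, dtype=float)
    center = loop.mean(axis=0)
    total = 0
    for i in range(len(loop)):
        a, b = loop[i], loop[(i + 1) % len(loop)]
        for j in range(len(other)):
            p, q = other[j], other[(j + 1) % len(other)]
            total += piercing_sign(p, q, center, a, b)
    return total

# Two hypothetical square loops forming a Hopf link.
loop_a = [(1, 1, 0), (-1, 1, 0), (-1, -1, 0), (1, -1, 0)]
loop_b = [(0.5, 0.25, -1), (0.5, 0.25, 1), (2.5, 0.25, 1), (2.5, 0.25, -1)]

print(signed_piercings(loop_a, loop_b), signed_piercings(loop_b, loop_a))
```

With this toy geometry each surface is pierced once and both piercings have the same sign; reversing the orientation of one loop flips both signs, which is the behaviour the signed Hopf link classes rely on.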

Two-chain deterministic links are exceptionally rare. They fall into two topological classes: the positive Hopf link and the positive Solomon link. The two-chain positive Hopf link is seen in citrate synthase [2] (Fig. 3). The low number of two-component deterministic links indicates that the formation of such structures is less probable. No three-chain deterministic links have been found to date.

The two-chain deterministic links are analyzed in the same manner as one-chain links. In particular, the LinkProt database assigns a minimal surface to each covalent loop and calculates the surface piercings (Fig. 3). A rotating depiction of the protein can be viewed at the top of the page.

In probabilistic links, the link components are formed by entire chains, closed at infinity (as in the case of defining the closure for knotted proteins [4]); this is described in greater detail in the link detection section. By definition, such links can only be formed by two or more chains. Since, in principle, the link type depends on the closure direction, the link type is calculated for many closure directions, and the likelihood distribution over the associated topological link types is presented for each protein. Moreover, the user can define the likelihood cut-off above which structures are displayed.

In fact, there are only a few structures for which the likelihood of link formation is greater than 60%, and all of these are Hopf link structures (either positive or negative). As the likelihood cut-off decreases, more structures and more topological motifs appear. In particular, for a likelihood cut-off of 10%, in addition to the Hopf link models, there are also Solomon links, the Star of David, and two-eights-link proteins. We would like to stress that for a low likelihood cut-off, some proteins may be present simultaneously in different topological groups. For example, the Poliovirus late RNA-release intermediate with PDB code 3IYC has a Solomon link likelihood of ca. 15% (so it is displayed in the Solomon link group for a cut-off lower than 15%), but it is also likely to be seen as a Hopf-link protein (likelihood of ca. 34%). The default likelihood cut-off is set at 30%. The most common two-chain probabilistic links are presented in Tab. 1. As with deterministic links, the LinkProt server also takes into account the orientation of the chains, which splits each possible topological type into subtypes.

Currently, LinkProt uses the HOMFLY-PT polynomial to identify all prime two-component links up to 14 crossings in their minimal crossing representation and most component links up to 8 crossings. Including three- and four-chain structures, polynomials for almost 24,000 structures were calculated. Nevertheless, some structures are assigned to a topological class named "other", reflecting the presence of chains that form exceptionally complicated links. Such structures will be progressively identified and added to the database.
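The effect of the likelihood cut-off can be sketched as follows. This is only an illustration: the function names, the "Unlink" label, and the list of 100 closure outcomes are hypothetical, with counts chosen to mirror the approximate 34% and 15% quoted above for 3IYC. The sketch also assumes that a link type is shown when its likelihood is at least the cut-off.

```python
from collections import Counter

def link_likelihoods(closure_labels):
    """Fraction of closure directions that produced each link type."""
    counts = Counter(closure_labels)
    total = len(closure_labels)
    return {label: n / total for label, n in counts.items()}

def groups_shown(likelihoods, cutoff=0.30, ignore=("Unlink",)):
    """Link groups in which a structure would appear for a given cut-off.
    The default of 0.30 follows the default cut-off stated in the text."""
    return sorted(
        label for label, p in likelihoods.items()
        if p >= cutoff and label not in ignore
    )

# Hypothetical outcomes of 100 random closures for one structure.
labels = ["Hopf"] * 34 + ["Solomon"] * 15 + ["Unlink"] * 51
likelihoods = link_likelihoods(labels)

print(groups_shown(likelihoods))               # default cut-off
print(groups_shown(likelihoods, cutoff=0.10))  # lower cut-off
```

At the default cut-off only the Hopf group contains the structure, while at a 10% cut-off it appears in both the Hopf and the Solomon groups.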

All two-chain links with probability greater than or equal to 10% are presented in Tab. 1. As on the "Search" page, clicking on each link class displays the subclasses of two-component links identified so far. Below the table there is a list of the other two-chain links identified, which did not reach 10% probability in any case. For each link the relevant HOMFLY-PT polynomial is shown.

In the case of **three-chain** structures, additional links are possible. These are mainly the composition of two Hopf links (one ring joining two others), the link in which each pair of components forms a Hopf link, and a Hopf link with a split circle. Such structures are presented in Tab. 2. All probabilistic links are treated in the same manner as deterministic links for each chain termini connection. In particular, for each termini connection, the LinkProt database assigns a (triangulated) minimal surface and counts its piercings (Fig. 4).
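The three motifs named above differ in how many pairs of chains are linked with each other, which can be shown with a toy classifier. This is a simplification for illustration only: LinkProt identifies links by their HOMFLY-PT polynomial, and a pairwise test cannot tell apart links that share the same pairwise pattern. The function name and the chain labels are hypothetical, and the input is assumed to list only pairs that form a Hopf link.

```python
def classify_three_chain(hopf_pairs):
    """Name the three-chain motif from the set of chain pairs that form
    a Hopf link (pairwise pattern only, not a full link invariant)."""
    pairs = {frozenset(pair) for pair in hopf_pairs}
    if any(len(pair) != 2 for pair in pairs):
        raise ValueError("each pair must contain two different chains")
    names = {
        0: "no pairwise Hopf link",
        1: "Hopf U Ring",                  # one linked pair and a split circle
        2: "Hopf # Hopf",                  # one ring joining two others
        3: "each pair forms a Hopf link",
    }
    if len(pairs) not in names:
        raise ValueError("three chains give at most three pairs")
    return names[len(pairs)]

print(classify_three_chain([("A", "B")]))
print(classify_three_chain([("A", "B"), ("B", "C")]))
print(classify_three_chain([("A", "B"), ("B", "C"), ("A", "C")]))
```

With three chains, any two linked pairs necessarily share a chain, so counting the linked pairs is enough to separate these three patterns.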

All three-chain links with probability greater than or equal to 10% are presented in Tab. 2. As on the "Search" page, clicking on each link class displays the subclasses of three-component links identified so far. Below the table there is a list of the other three-chain links identified, which did not reach 10% probability in any case. For each link the relevant HOMFLY-PT polynomial is shown.


The **four-component links** are built mainly as a split sum of two- or three-component links with a suitable number of rings (Hopf U 2 Rings, Hopf # Hopf U Ring, or Solomon # Hopf U Ring). Nevertheless, in the case of the crystal structure of human apolipoprotein A-I there is a 20% probability of a four-component link with at least 12 crossings in its minimal crossing representation. As such links have not been named officially, we call that particular link (distinguished by its HOMFLY-PT polynomial) L12n1. We do not claim, however, that it is a 12-crossing link (although it seems so).

The possible four-component links with at least 10% probability are given in Tab. 3. For four-component links, however, the link type depends much more strongly on the direction of the termini extension. Therefore, the table of other links found, with probability less than 10%, can also be useful. This table, with the relevant HOMFLY-PT polynomials, is given below Tab. 3.

In the structure of Bordetella bacteriophage BBP-1 (PDB code: 3J4U) there is a "chainmail" structure that was first identified in [5]. In the head of BBP-1, there are major capsid proteins (MCP) and 'cement proteins' (CP) that decorate the shell. However, there was no clear description of how the proteins in the head of BBP-1 are linked. We fill this gap by classifying these links.

The macromolecular links are formed by closed loops, which are defined by connecting chains in the macromolecular structure. Currently, LinkProt contains 4 types of macromolecular links in MCP. In each of the components, the orientation is clockwise, and the closure of the components is made by connecting chains in the natural order of the amino acids:

- The simplest example is a Hopf link, which has a positive sign if we take into account the order of the amino acids (the sign depends on the arrangements of the components). Neighboring polygons (these can be pentagons or hexagons) form Hopf links.
- One level higher, i.e. considering three-component links, a link with 6 crossings is formed (found also among three-chain probabilistic links). It is distinguished from the other three-component links with six crossings by the fact that it is non-alternating: as one follows a component of the link, the crossings do not always alternate between over- and under-crossings. Each pair of components is a positive Hopf link.

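- The remaining two types, listed as Flower(5) and Flower(6) in the table of macromolecular links below, are distinguished by the central component:
  - the central component being a pentagon: it has 20 crossings,
  - the central component being a hexagon: it has 24 crossings.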
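One way to read the crossing numbers above is a simple count over Hopf-linked pairs. The sketch assumes (this is not stated in the source) that the link consists of a central polygon with n sides together with its n neighbors, that the central polygon forms a Hopf link with each neighbor, that consecutive neighbors form Hopf links with each other, and that each Hopf link contributes 2 crossings to the diagram. The function name is hypothetical.

```python
def flower_crossings(n_sides):
    """Crossings in a diagram of a central n-gon plus its n neighbors,
    assuming every neighboring pair of polygons forms a Hopf link."""
    center_pairs = n_sides   # central polygon with each neighbor
    rim_pairs = n_sides      # consecutive neighbors around the center
    crossings_per_hopf = 2
    return crossings_per_hopf * (center_pairs + rim_pairs)

print(flower_crossings(5), flower_crossings(6))
```

Under these assumptions the count is 4n, which reproduces the 20 and 24 crossings quoted for the pentagon and hexagon cases.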

Below we present the tables containing links with probability equal to or greater than 10% for two-, three-, and four-component links. For each link, the number of distinct subtypes (taking into account chirality and orientation), both theoretically **possible** and **observed** (with at least 10% probability), is given. The difference between these numbers is the number of **missing** subclasses. Clicking on each row shows the observed subclasses with their representative structures and HOMFLY-PT polynomials. Below each table one can click on **"Show observed structures with less then 10% probability"** to find examples of other links stored in the LinkProt database.
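The "missing" column is plain arithmetic on the other two columns, as the short sketch below shows using the two-component values from the first table. The dictionary layout and the function name are illustrative choices, not part of LinkProt.

```python
# (possible, observed) subclass counts for the two-component links.
two_component = {
    "Hopf": (2, 2),
    "Solomon": (4, 4),
    "Star of David": (4, 1),
    "Trefoil U Ring": (2, 2),
    "Cinquefoil U Ring": (2, 1),
}

def missing_subclasses(table):
    """Number of theoretically possible subclasses not yet observed."""
    return {name: possible - observed
            for name, (possible, observed) in table.items()}

missing = missing_subclasses(two_component)
for name, count in missing.items():
    print(f"{name}: {count} missing")
```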

Tab. 1. Two-component links.

| Link name | Possible topologically different structures | Observed topologically different structures | Missing structures |
|---|---|---|---|
| Hopf | 2 | 2 | 0 |
| Solomon | 4 | 4 | 0 |
| Star of David | 4 | 1 | 3 |
| Trefoil U Ring | 2 | 2 | 0 |
| Cinquefoil U Ring | 2 | 1 | 1 |

Tab. 2. Three-component links.

| Link name | Possible topologically different structures | Observed topologically different structures | Missing structures |
|---|---|---|---|
| Hopf U Ring | 2 | 2 | 0 |
| Solomon U Ring | 4 | 2 | 2 |
|  | 4 | 3 | 1 |
| Trefoil U 2 Rings | 2 | 1 | 1 |
| Hopf # Hopf | 3 | 3 | 0 |

Tab. 3. Four-component links.

| Link name | Possible topologically different structures | Observed topologically different structures | Missing structures |
|---|---|---|---|
| Hopf U 2 Rings | 2 | 1 | 1 |
| Hopf # Hopf U Ring | 3 | 1 | 2 |
| Solomon # Hopf U Ring | 8 | 1 | 7 |
| L12n1 | 6 | 1 | 5 |

| Link type | HOMFLY-PT polynomial | Example |
|---|---|---|
| Flower(5) | --- | Fragment of HK97 bacteriophage chainmail structure |
| Flower(6) | --- | Fragment of HK97 bacteriophage chainmail structure |

[1] Dabrowski-Tumanski, P., Niemyska, W., & Sulkowska, J. I.

[2] Boutz, D. R., Cascio, D., Whitelegge, J., Perry, L. J., & Yeates, T. O. (2007).

[3] Dabrowski-Tumanski, P., Niemyska, W., Pasznik, P., & Sulkowska, J. I. (2016).

[4] Sulkowska, J. I., Rawdon, E. J., Millett, K. C., Onuchic, J. N., & Stasiak, A. (2012).

[5] Zhang, X., Guo, H., Jin, L., Czornyj, E., Hodes, A., Hui, W. H., ... & Zhou, Z. H. (2013).

LinkProt | Interdisciplinary Laboratory of Biological Systems Modelling