DOI: 10.1073/pnas.80.3.726. Rapid similarity searches of nucleic acid and protein data banks.

Crossref journal-article

Proceedings of the National Academy of Sciences

Proceedings of the National Academy of Sciences (341)

Abstract

With the development of large data banks of protein and nucleic acid sequences, the need for efficient methods of searching such banks for sequences similar to a given sequence has become evident. We present an algorithm for the global comparison of sequences based on matching k-tuples of sequence elements for a fixed k. The method results in substantial reduction in the time required to search a data bank when compared with prior techniques of similarity analysis, with minimal loss in sensitivity. The algorithm has also been adapted, in a separate implementation, to produce rigorous sequence alignments. Currently, using the DEC KL-10 system, we can compare all sequences in the entire Protein Data Bank of the National Biomedical Research Foundation with a 350-residue query sequence in less than 3 min and carry out a similar analysis with a 500-base query sequence against all eukaryotic sequences in the Los Alamos Nucleic Acid Data Base in less than 2 min.
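The k-tuple matching idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a simplified illustration that assumes a hash-table lookup of the query's k-tuples and a count of matches per diagonal (target offset minus query offset), with no scoring window, gap penalties, or significance estimate. The function names `ktuple_index`, `diagonal_counts`, and `best_diagonal` are hypothetical.

```python
from collections import defaultdict


def ktuple_index(seq, k):
    """Map each k-tuple in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_counts(query, target, k):
    """Count k-tuple matches on each diagonal (target offset - query offset).

    Assumption: a plain match count per diagonal stands in for whatever
    scoring a real search program would apply.
    """
    index = ktuple_index(query, k)
    counts = defaultdict(int)
    for j in range(len(target) - k + 1):
        # .get avoids creating empty entries for k-tuples absent from the query
        for i in index.get(target[j:j + k], ()):
            counts[j - i] += 1
    return dict(counts)


def best_diagonal(query, target, k):
    """Return (diagonal, match_count) for the densest diagonal, or None."""
    counts = diagonal_counts(query, target, k)
    if not counts:
        return None
    # Ties are broken in favour of the diagonal closest to zero.
    return max(counts.items(), key=lambda kv: (kv[1], -abs(kv[0])))


if __name__ == "__main__":
    print(best_diagonal("ACGTACGT", "TTACGTACGTTT", 4))
```

Because each target position costs one table lookup, the work grows roughly with the length of the target plus the number of k-tuple hits, rather than with the product of the two sequence lengths as in a full dynamic-programming comparison. A larger k gives fewer chance hits at the cost of missing weaker similarities.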

Bibliography

Wilbur, W. J., & Lipman, D. J. (1983). Rapid similarity searches of nucleic acid and protein data banks. Proceedings of the National Academy of Sciences, 80(3), 726–730.

Authors 2

W J Wilbur (first)
D J Lipman (additional)

References 0 Referenced 838

None

Dates

Type	When
Created	May 31, 2006, 3:24 a.m.
Deposited	April 13, 2022, 11:43 a.m.
Indexed	Aug. 30, 2025, 12:49 p.m.
Issued	Feb. 1, 1983
Published	Feb. 1, 1983
Published Online	Feb. 1, 1983
Published Print	Feb. 1, 1983

Funders 0

None

BibTeX

@article{Wilbur_1983,
  title={Rapid similarity searches of nucleic acid and protein data banks.},
  volume={80},
  ISSN={1091-6490},
  url={http://dx.doi.org/10.1073/pnas.80.3.726},
  DOI={10.1073/pnas.80.3.726},
  number={3},
  journal={Proceedings of the National Academy of Sciences},
  publisher={Proceedings of the National Academy of Sciences},
  author={Wilbur, W J and Lipman, D J},
  year={1983},
  month=feb,
  pages={726–730}
}

JSON

{
  "indexed": {
    "date-parts": [
      [
        2025,
        8,
        30
      ]
    ],
    "date-time": "2025-08-30T16:49:14Z",
    "timestamp": 1756572554878
  },
  "reference-count": 0,
  "publisher": "Proceedings of the National Academy of Sciences",
  "issue": "3",
  "content-domain": {
    "domain": [
      "www.pnas.org"
    ],
    "crossmark-restriction": true
  },
  "published-print": {
    "date-parts": [
      [
        1983,
        2
      ]
    ]
  },
  "abstract": "<jats:p>With the development of large data banks of protein and nucleic acid sequences, the need for efficient methods of searching such banks for sequences similar to a given sequence has become evident. We present an algorithm for the global comparison of sequences based on matching k-tuples of sequence elements for a fixed k. The method results in substantial reduction in the time required to search a data bank when compared with prior techniques of similarity analysis, with minimal loss in sensitivity. The algorithm has also been adapted, in a separate implementation, to produce rigorous sequence alignments. Currently, using the DEC KL-10 system, we can compare all sequences in the entire Protein Data Bank of the National Biomedical Research Foundation with a 350-residue query sequence in less than 3 min and carry out a similar analysis with a 500-base query sequence against all eukaryotic sequences in the Los Alamos Nucleic Acid Data Base in less than 2 min.</jats:p>",
  "DOI": "10.1073/pnas.80.3.726",
  "type": "journal-article",
  "created": {
    "date-parts": [
      [
        2006,
        5,
        31
      ]
    ],
    "date-time": "2006-05-31T07:24:18Z",
    "timestamp": 1149060258000
  },
  "page": "726-730",
  "update-policy": "http://dx.doi.org/10.1073/pnas.cm10313",
  "source": "Crossref",
  "is-referenced-by-count": 838,
  "title": "Rapid similarity searches of nucleic acid and protein data banks.",
  "prefix": "10.1073",
  "volume": "80",
  "author": [
    {
      "given": "W J",
      "family": "Wilbur",
      "sequence": "first",
      "affiliation": []
    },
    {
      "given": "D J",
      "family": "Lipman",
      "sequence": "additional",
      "affiliation": []
    }
  ],
  "member": "341",
  "published-online": {
    "date-parts": [
      [
        1983,
        2
      ]
    ]
  },
  "container-title": "Proceedings of the National Academy of Sciences",
  "original-title": [],
  "language": "en",
  "link": [
    {
      "URL": "https://pnas.org/doi/pdf/10.1073/pnas.80.3.726",
      "content-type": "unspecified",
      "content-version": "vor",
      "intended-application": "similarity-checking"
    }
  ],
  "deposited": {
    "date-parts": [
      [
        2022,
        4,
        13
      ]
    ],
    "date-time": "2022-04-13T15:43:59Z",
    "timestamp": 1649864639000
  },
  "score": 1,
  "resource": {
    "primary": {
      "URL": "https://pnas.org/doi/full/10.1073/pnas.80.3.726"
    }
  },
  "subtitle": [],
  "short-title": [],
  "issued": {
    "date-parts": [
      [
        1983,
        2
      ]
    ]
  },
  "references-count": 0,
  "journal-issue": {
    "issue": "3",
    "published-print": {
      "date-parts": [
        [
          1983,
          2
        ]
      ]
    }
  },
  "alternative-id": [
    "10.1073/pnas.80.3.726"
  ],
  "URL": "http://dx.doi.org/10.1073/pnas.80.3.726",
  "relation": {},
  "ISSN": [
    "0027-8424",
    "1091-6490"
  ],
  "subject": [],
  "container-title-short": "Proc. Natl. Acad. Sci. U.S.A.",
  "published": {
    "date-parts": [
      [
        1983,
        2
      ]
    ]
  },
  "assertion": [
    {
      "value": "1983-02-01",
      "order": 2,
      "name": "published",
      "label": "Published",
      "group": {
        "name": "publication_history",
        "label": "Publication History"
      }
    }
  ]
}