Itty-bitty bytes of code and mini molecules of future drugs are coming together in a mother lode of data for scientists to examine in a project led by IBM and three other companies in coordination with the National Institutes of Health.

Collaborating with Bristol-Myers Squibb, DuPont, and Pfizer, IBM will announce Thursday that it is providing a database of more than 2.4 million chemical compounds extracted from about 4.7 million patents and 11 million biomedical journal abstracts from 1976 to 2000.

The new collection will be merged into a database at the National Center for Biotechnology Information (NCBI) at the National Institutes of Health in Bethesda, Md. That database, called PubChem, is open to the public on the Internet.

"We're thrilled that IBM did this, because it adds important value," said NCBI director David Lipman. He added that the site already gets about 15,000 users per day and that various companies and organizations download 15 terabytes of information per day.

Lipman said the key aspect is that the program pulls together information, whether previously secret or simply on paper somewhere, in a digital form that can be sorted and sifted.

IBM is after profits, too, so it is simultaneously rolling out a fee- and cloud-based computer program called SIIP that it hopes corporate researchers will pay to use in their pursuit of the next billion-dollar blockbuster drug or chemical compound. Its imaging technology, for example, can scan a diagram of a molecular structure on a printed page and digitize it for examination.

IBM's global life sciences leader, Chris Moore, said the tools will help companies and their researchers keep track of whether competitors are infringing on their patents and identify unmet needs where profits might be had.

"This tool enables organizations to see what science has been done, to be a starting point, to help in collaboration and in understanding the landscape of the market," Moore said.

Moore declined to give costs for the service but said IBM's rates vary enough that small start-ups would be able to purchase it.

"IBM is one of the few companies that spends as much on R&D as life sciences companies themselves," Moore said. Referring to the project, he said: "It's complicated and global, all of which is appealing to me personally and the company as a whole."

Contact staff writer David Sell at dsell@phillynews.com or 215-854-4506. Read his blog at www.philly.com/phillypharma and on Twitter @phillypharma.