US20070026397A1 - Method for producing second-generation library

Info

Publication number
US20070026397A1
Authority
US
United States
Prior art keywords
library
nucleic acid
codon
identifier
codons
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/546,538
Inventor
Per-Ola Freskgard
Alex Gouliaev
Thomas Thisted
Eva Olsen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nuevolution AS
Original Assignee
Nuevolution AS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nuevolution AS
Priority to US10/546,538
Publication of US20070026397A1
Assigned to NUEVOLUTION A/S reassignment NUEVOLUTION A/S ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: GOULIAEV, ALEX HAAHR, OLSEN, EVA KAMPMANN, THISTED, THOMAS, FRESKGARD, PER-OLA
Priority to US13/179,283 (US9096951B2)

Classifications

    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C40B40/08 Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J19/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J19/0046 Sequential or parallel reactions, e.g. for the synthesis of polypeptides or polynucleotides; Apparatus and devices for combinatorial chemistry or for making molecular arrays
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1068 Template (nucleic acid) mediated chemical library synthesis, e.g. chemical and enzymatical DNA-templated organic molecule synthesis, libraries prepared by non ribosomal polypeptide synthesis [NRPS], DNA/RNA-polymerase mediated polypeptide synthesis
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6811 Selection methods for production or design of target specific oligonucleotides or binding molecules
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00592 Split-and-pool, mix-and-divide processes
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/04 Methods of creating libraries, e.g. combinatorial synthesis using dynamic combinatorial chemistry techniques
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/08 Liquid phase synthesis, i.e. wherein all library building blocks are in liquid phase or in solution during library creation; Particular methods of cleavage from the liquid support
    • C40B50/10 Liquid phase synthesis, i.e. wherein all library building blocks are in liquid phase or in solution during library creation; Particular methods of cleavage from the liquid support involving encoding steps

Definitions

  • the invention relates to a method for producing a second-generation compound library with an improved desired property profile.
  • the parent genotype is carried on to the off-spring and results in a phenotype in which the exact type and sequence of amino acids is retained, unless a mutation and/or recombination has occurred.
  • the present method only retains the identity of chemical entities, e.g. amino acids, while the sequence wholly or partly is scrambled. The result is a focused second-generation library with lower diversity.
  • the biological evolution is based on the survival of specific genotypes that encode phenotypes with the most suitable functionalities in a certain environment.
  • DNA serves two important functions in the natural selection process. One function is obviously to encode for the type of nucleotides used and the other function is to encode for the specific order of nucleotide sequences in a nucleic acid sequence.
  • the strategy used in nature, i.e. encoding the exact type as well as the precise sequence of nucleotides, ensures an extreme similarity between the progeny and its parents. Thus, conserving almost the exact sequence and type of the nucleotides is absolutely essential in order to create offspring with a high functionality.
  • the changes in the genotype from one generation to another, which allow for evolution, are determined by the random mutation rate and recombination between the two parents' genotypes.
  • the method of nature is used.
  • the identifier nucleic acid sequence (genotype) carries the information from one generation to the next.
  • WO 93/03172 A1 discloses a method for identifying a polypeptide ligand having a desired property in a polypeptide library.
  • a translatable mRNA mixture is provided, which is mixed with a mixture of ribosome complexes to form a translation product attached to the mRNA strand responsible for the formation thereof.
  • the ribosome complexes binding to a target are partitioned from the remainder of the library.
  • amplification of the mRNA strands of the partitioned ribosome complexes that have bound to the target follows.
  • the amplified mRNA strands are used for the production of a second generation library, which is subjected to a renewed contact with the target.
  • the method is repeated a sufficient number of times until the size of the library has narrowed to a small pool of high affinity binders.
  • In WO 98/31700 A1, a method for selecting a DNA molecule that encodes a desired protein is disclosed.
  • the method implies the initial presence of a pool of candidate RNA molecules, which subsequently is translated into a corresponding pool of RNA-protein fusions.
  • the mRNA-protein fusion products are subjected to a selection process, i.e. the fusion products are presented to a target molecule, and a new pool of complexes capable of binding to the target is partitioned. From the new pool of complexes, the mRNAs are recovered and amplified for use in a subsequent round of library generation.
  • Xu, L. et al., Chemistry & Biology, Vol. 9, 933-942, August 2002, discloses a practical embodiment in which a library of more than 10^12 unique mRNA-protein fusion products is used, through ten rounds of library generation and selection, to identify a high affinity binding protein.
  • In WO 00/23458 A1, libraries of complexes comprising non-natural molecules attached to corresponding nucleic acid sequences are suggested. After a selection of the library has been conducted, the nucleic acid sequences of successful complexes are amplified by PCR and a new library is prepared from these nucleic acid sequences. The same method of carrying information from an initial library to the next library is applied in WO 02/074929 A2 and WO 02/103008 A2.
  • the present invention provides a new method for evolving encoded molecules.
  • the method is based on the identification of chemical entities used in the synthesis of reaction products of successful complexes and the application, at least in part, of these chemical entities in the preparation of the next generation library.
  • the utilization of preferable chemical entities and the exclusion of certain undesired chemical entities in the next library generation generally imply that the next generation library has a smaller size compared to the size of the initial library, thereby, at the same time, retaining the desirable encoded molecules in the library.
  • the present invention concerns a method for producing a composition of molecules with an improved desired property, said method comprising the steps of: providing an initial library comprising a plurality of different encoded molecules associated with a corresponding identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction product of multiple chemical entities and the identifier nucleic acid sequence comprises codons identifying said chemical entities; subjecting the initial library to a condition partitioning members having encoded molecules displaying a predetermined property from the remainder of the initial library; identifying codons of the identifier nucleic acid sequences of the partitioned members of the initial library; and preparing a second-generation library of encoded molecules using the chemical entities coded for by the codons of the partitioned members of the initial library or a part thereof.
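  • The four claimed steps (provide a library, partition, identify codons, rebuild from the coded chemical entities) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the entity names "A" to "J", the three reaction steps, the toy `binds` rule standing in for a real partitioning condition, and the `min_count` cut-off are not taken from the application.

```python
from collections import Counter
from itertools import product


def make_library(entities, n_steps):
    """Enumerate encoded molecules as codon tuples, one codon per reaction step."""
    return list(product(entities, repeat=n_steps))


def partition(library, has_property):
    """Keep the members whose encoded molecule displays the predetermined property."""
    return [member for member in library if has_property(member)]


def identify_codons(members):
    """Count how often each codon occurs among the partitioned identifiers."""
    return Counter(codon for member in members for codon in member)


def second_generation(codon_counts, n_steps, min_count):
    """Rebuild a library from the chemical entities whose codons were enriched."""
    kept = sorted(c for c, n in codon_counts.items() if n >= min_count)
    return make_library(kept, n_steps)


# Hypothetical data: ten chemical entities, three reaction steps.
entities = list("ABCDEFGHIJ")
initial = make_library(entities, n_steps=3)


def binds(member):
    """Toy stand-in for a binding assay: a member 'binds' if it contains B and G."""
    return "B" in member and "G" in member


hits = partition(initial, binds)
counts = identify_codons(hits)
next_library = second_generation(counts, n_steps=3, min_count=10)

print(len(initial), len(hits), len(next_library))
```

  • In this toy run only the entities B and G pass the cut-off, so the second-generation library is much smaller than the initial one while still containing every member that satisfied the toy rule using only those two entities.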
  • the present invention relates to a novel approach to perform evolution of molecules with a desired property, said approach being different from the approach of nature and the prior art.
  • the invention is based on the selection of chemical entities, the counterpart of amino acids in Nature, instead of the precise sequence of chemical entities. This new approach is powerful in ex vivo conditions when high functionality of the offspring is not vital for success and when the number of chemical entities relative to the number of reactants used in each encoded molecule is high.
  • the method disclosed herein will be increasingly effective as the library size increases. This is because more chemical entities are used when the library size is increased while the number of reactions forming the encoded molecule is fixed, and because different chemical entities tend to be involved in encoded molecules having the desired property.
  • the chemical entities which are part of the final selected molecules, will be enriched in each round of selection. Finally, when the diversity has been extensively reduced, the enriched molecules are decoded from the identifier nucleic acid sequence comprising the codons of the chemical entities that have participated in the formation of the encoded molecule.
  • the strategy of enriching chemical entities instead of specific combinations of chemical entities searches the chemical space more efficiently for all combinations of chemical entities that are likely to show a certain property, such as a binding ability towards a target.
  • chemical entities having a certain impact on the formation of encoded molecules are allowed to recombine in each new library generation.
  • the recombination is random, i.e. once a chemical entity has qualified as being of interest it is allowed in every position of the reaction sequence.
  • the recombination is semi-random, i.e. once a chemical entity is qualified as being of interest it is used in a certain position in the reaction sequence of the encoded molecule.
  • the amount of the chemical entity used in a subsequent library generation is dependent on the frequency and the amount of the partitioned library members.
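  • The difference between random and semi-random recombination, and the idea of dosing each chemical entity according to its frequency among the partitioned members, can be sketched as follows. The codon tuples in `selected` are hypothetical, and using a simple occurrence fraction as the "amount" is an assumption; the text only says the amount depends on frequency.

```python
from collections import Counter
from itertools import product


def random_recombination(selected):
    """Random mode: every qualified entity is allowed at every position."""
    pool = sorted({codon for member in selected for codon in member})
    n_positions = len(selected[0])
    return list(product(pool, repeat=n_positions))


def semi_random_recombination(selected):
    """Semi-random mode: an entity is reused only at the position where it was seen."""
    n_positions = len(selected[0])
    per_position = [sorted({m[i] for m in selected}) for i in range(n_positions)]
    return list(product(*per_position))


def relative_amounts(selected):
    """Fraction of all codon occurrences contributed by each entity."""
    counts = Counter(codon for member in selected for codon in member)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical partitioned members (one codon per reaction step).
selected = [("A", "X", "P"), ("B", "X", "Q"), ("A", "Y", "P")]

random_lib = random_recombination(selected)
semi_lib = semi_random_recombination(selected)
amounts = relative_amounts(selected)

print(len(random_lib), len(semi_lib))
```

  • With six qualified entities and three positions, the random mode enumerates 6^3 combinations while the semi-random mode enumerates only 2 x 2 x 2, which shows how strongly position constraints narrow the next library.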
  • the present invention may be of special interest when a group of chemical entities are selected from a larger pool of chemical entities in the formation of a first library. Selecting chemical entities resulting in encoded molecules having a certain property in a first library and spiking with remaining chemical entities of the pool allows for the formation of a second-generation library not necessarily of a smaller size but enriched in encoded molecules having a certain property.
  • the second-generation library may be formed of a reaction product of the chemical entities without attaching the reaction product to a nucleic acid.
  • the individual reaction products are formed in discrete reaction compartments in accordance with traditional combi-chem technology.
  • the second-generation library is prepared as the first generation library, i.e. the second-generation library comprises a plurality of different encoded molecules associated with a corresponding identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction product of multiple chemical entities and the identifier nucleic acid sequence comprises codons identifying said chemical entities.
  • it comprises subjecting the second-generation library to a condition partitioning members having encoded molecules displaying a predetermined property from the remainder of the second-generation library.
  • the second-generation library may be partitioned as to the same property or a different property.
  • the second-generation library can be screened against the same target or a different target.
  • the invention comprises the step of deducing the identity of the encoded molecule(s) using the identifier nucleic acid sequence, when present.
  • a third or further generation library may be formed and screened before the final deducing step is performed.
  • the decoding includes decoding the codons of the identifier nucleic acid sequence to establish the synthesis history of the encoded molecules.
  • the synthesis history includes the identity of the chemical entities used and the point in time they enter the sequence of reactions resulting in the encoded molecule.
  • the encoded molecule is preferably a reaction product in which multiple chemical entity precursors have participated.
  • the encoded molecule may have any chemical structure.
  • the multiple chemical entities are precursors for a structural unit appearing in the encoded molecule.
  • the chemical entities may also perform a chemical reaction with the nascent encoded molecule, which results in an alteration or removal of chemical groups.
  • the encoded molecule is a scaffolded molecule, i.e. various chemical entities have reacted with a chemical core structure like steroid, benzodiazepine, retinol, camphor, ephedrine, penicillin, cannabinol, coumarin, oxazol, etc.
  • the encoded molecule is fully or partly a polymer.
  • the polymer may be of a type which occurs naturally or may be a non-naturally occurring polymer. Nature only has the possibility of preparing α-polypeptides using the recognition of a codon of an mRNA strand by the anticodon of a charged tRNA.
  • the encoded molecule is not an α-peptide.
  • the chemical entities are reacted without enzymatic interaction to produce the encoded molecule.
  • the encoded molecule can be associated with the nucleic acid sequence identifier in any appropriate way.
  • the encoded molecule associated with the corresponding identifier nucleic acid sequence is a bifunctional complex.
  • the bifunctional complex may be formed by covalent or non-covalent attachment of the encoded molecule to the identifier nucleic acid sequence.
  • an identifier nucleic acid sequence is physically a distinct entity separated from the encoded molecule, wherein the identifier identifies the spatial position of an encoded molecule, e.g. in the same compartment in which an encoded molecule is formed a corresponding identifier oligonucleotide is generated.
  • the conditions partitioning complexes of interest from the remainder of the library may be chosen from a variety of possibilities.
  • the condition relates to physical parameters, so that complexes displaying a physical stability under e.g. certain temperature conditions, certain acidic conditions, certain radiation conditions etc. are selected from the library.
  • the condition for partitioning the desired complexes includes subjecting the initial library to a molecular target and partitioning complexes binding to this target.
  • the molecular target may be any compound of interest.
  • Exemplary targets are proteins, carbohydrates, polysaccharides, hormones, receptors, antibodies, viruses, antigens, cells, tissues etc.
  • the target is immobilized on a solid support, such as column material and contacted with the candidate complexes in a fluid media followed by a partitioning of the complexes capable of binding to the target under the contacting conditions used.
  • the binding complexes are eluted from the column using increased stringency conditions.
  • the complexes as such or only the identifier part is harvested after the partitioning step.
  • the identifier nucleic acid sequences are amplified prior to the identification step.
  • the amplification is suitably performed applying polymerase chain reaction (PCR).
  • the amplified identifiers may be explicitly or implicitly identified. When the codons are identified explicitly, the sequence and identity of nucleotides in the codon are made known to the experimenter, whereas, when the codons of the identifiers are implicitly identified, the experimenter is not presented with the information.
  • any suitable method for identifying codons may be used.
  • traditional sequencing, e.g. by using a modification of the Sanger method or pyrosequencing methods, identifies the codons.
  • the codons of the identifier nucleic acid sequences of the partitioned members of the initial library are identified by contacting said identifier nucleic acid sequences with a pool of nucleic acid fragments under conditions allowing for hybridisation.
  • the pool of nucleic acid fragments may be immobilized or in solution.
  • the pool of nucleic acid fragments comprises a plurality of single stranded nucleic acid probes immobilized in discrete areas of a solid support, wherein the nucleic acid probes are capable of hybridising to a codon of the identifier nucleic acid sequence comprising codons.
  • the nucleic acid probes may be positioned on a microarray, such that the identity of the codons is revealed by observing the discrete areas of the support in which a hybridisation event has occurred.
  • the nucleic acid probe can be directly hybridised to the identifier or the nucleic acid probe of the array is hybridised to an identifier nucleic acid sequence through an adapter oligonucleotide having a sequence complementing the probe as well as one or more codons of the identifier nucleic acid sequence.
  • the probe may identify a single codon of an identifier or a probe of the array is capable of hybridising to two codons of the identifier nucleic acid sequence or a sequence complementary to said sequence. The ability to hybridise two or more codons makes it possible to study the influences of neighbouring chemical entities on each other.
  • a nucleic acid probe of the array is capable of hybridising to all codons of an identifier nucleic acid sequence. This latter option will fully decode the identity of the encoded molecule. Usually, however, full decoding is only possible for a relatively small library size, as it presupposes a nucleic acid probe for each member of the library.
  • useful information about a certain codon may be gathered by detecting the codon together with a framing sequence identifying the position in the reaction history of the chemical entity corresponding to said codon.
  • the library size is 10^8.
  • 10^8 is in excess of what is possible to detect on an array, especially if multiple determinations for each identifier are considered necessary to obtain a high accuracy.
  • an array of just 100 probes complementary to the 100 codons will reveal important information prior to or subsequent to a selection. In the event a framing sequence is detected together with the codon an array of 400 probes is needed.
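  • The probe counts quoted above follow from simple arithmetic, sketched below. The figure of four reaction positions is an inference from the numbers given (400 probes divided by 100 codons), not something stated explicitly in this excerpt; the function names are hypothetical.

```python
def library_size(codons_per_position, positions):
    """Number of distinct identifiers when each position uses the same codon set."""
    return codons_per_position ** positions


def probes_needed(codons_per_position, positions, with_framing):
    """Probes for codon-level readout, with or without position-specific framing."""
    if with_framing:
        # One probe per (codon, position) pair.
        return codons_per_position * positions
    # One probe per codon, regardless of where it sits in the identifier.
    return codons_per_position


CODONS = 100
POSITIONS = 4  # assumption inferred from 400 / 100

print(library_size(CODONS, POSITIONS))          # 100000000
print(probes_needed(CODONS, POSITIONS, False))  # 100
print(probes_needed(CODONS, POSITIONS, True))   # 400
```

  • Full decoding would need one probe per library member (10^8 under these assumptions), whereas codon-level readout needs only 100 or 400 probes.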
  • a suitable method for identifying a hybridisation event is to use a label. Therefore, in a preferred embodiment, the existence of a hybridisation event is measured through labelling of the identifier nucleic acid sequence, or an amplification product thereof. When the label emits light, the hybridisation event is measured by the emission of light in a scanner. To reveal the relative abundance of each chemical entity in the library of encoded molecules, the relative intensity of light in each discrete spot is measured.
  • the measurement of a hybridisation event may be conducted by various methods known in the art.
  • the presence or absence of a hybridisation event may be measured in a scanner, e.g. a confocal scanner.
  • the scanner may be connected with computer software, which is able to quantify the amount of light measured.
  • the amount of light measured correlates with the amount of identifier annealed to the probes.
  • the information can be used to design optimized libraries including chemical entities based on both the selection data and the chemical structure.
  • the microarray analysis will first of all detect which chemical entities pass the partitioning step. Secondly, the relative intensity on the microarray will reflect the relative binding affinity of the chemical entities. Finally, the structures of the chemical entities are directly identified due to the position of the probes on the array. For instance, chemical entities that are strongly selected in a partitioning process but possess some unfavourable chemical structure can be excluded in the next generation of library. Similarly, chemical entities that are weakly selected in a partitioning process but possess some favourable chemical structure can be included in the next generation of library. Thus, the next generation library design can be based both on a rational choice of chemical entities with lead-like structures and the selection pressure detected on the microarray.
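  • A minimal sketch of how spot intensities might be turned into a next-generation entity list is given here. The entity labels, intensity values, background subtraction, and the 0.2 threshold are all hypothetical; the `exclude` and `include` arguments model the rational overrides described above (dropping a strongly selected entity with an unfavourable structure, keeping a weakly selected one with a favourable structure).

```python
def relative_abundance(spot_intensity, background=0.0):
    """Normalise background-corrected spot intensities so they sum to 1."""
    corrected = {e: max(v - background, 0.0) for e, v in spot_intensity.items()}
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no signal above background")
    return {e: v / total for e, v in corrected.items()}


def design_next_library(abundance, threshold, exclude=(), include=()):
    """Combine selection data with manual structure-based overrides."""
    chosen = {e for e, a in abundance.items() if a >= threshold}
    chosen -= set(exclude)   # strongly selected but structurally unfavourable
    chosen |= set(include)   # weakly selected but structurally favourable
    return sorted(chosen)


# Hypothetical scanner read-out, one value per probe spot.
intensities = {"E1": 900.0, "E2": 500.0, "E3": 100.0, "E4": 100.0}

abundance = relative_abundance(intensities, background=100.0)
next_entities = design_next_library(
    abundance, threshold=0.2, exclude=["E1"], include=["E3"]
)

print(next_entities)
```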
  • nucleic acid fragments are primer oligonucleotides
  • the identification involves subjecting the hybridisation complex between the primer oligonucleotides and the identifier nucleic acid sequences to a condition allowing for an extension reaction to occur when the primer is sufficiently complementary to a part of the identifier nucleic acid sequence, and evaluating, based on measurement of the extension reaction, the presence, absence, or relative abundance of one or more codons.
  • the extension reaction requires a primer, a polymerase as well as a collection of deoxyribonucleotide triphosphates (abbreviated dNTP's herein) to proceed.
  • An extension product may be obtained in the event the primer is sufficiently complementary to an identifier oligonucleotide for a polymerase to recognise the double helix as a substrate.
  • the deoxyribonucleotide triphosphates (blend of dATP, dCTP, dGTP, and dTTP) are incorporated into the extension product using the identifier oligonucleotide as template.
  • the conditions allowing for the extension reaction to occur usually include a suitable buffer.
  • the buffer may be any aqueous or organic solvent or mixture of solvents in which the polymerase has a sufficient activity.
  • the polymerase and the mixture of dNTP's are generally included in a buffer which is added to the identifier oligonucleotide and primer mixture.
  • An exemplary kit comprising the polymerase and the dNTP's for performing the extension process comprises the following: 50 mM KCl; 10 mM Tris-HCl at pH 8.3; 1.5 mM MgCl2; 0.001% (wt/vol) gelatin; 200 μM dATP; 200 μM dTTP; 200 μM dCTP; 200 μM dGTP; and 2.5 units Thermus aquaticus (Taq) DNA polymerase I (U.S. Pat. No. 4,889,818) per 100 microliters (μl) of buffer.
  • the primer may be selected to be complementary to one or more codons or parts of such codons.
  • the length of the primers may be determined by the length of the codons, however, the primers usually are at least about 11 nucleotides in length, more preferred at least 15 nucleotides in length to allow for an efficient extension by the polymerase.
  • the presence or absence of one or more codons is indicated by the presence of or absence of an extension product.
  • the extension product may be measured by any suitable method, such as size fractioning on an agarose gel and staining with ethidium bromide.
  • the admixture of identifier oligonucleotide and primer is thermocycled to obtain a sufficient number of copies of the extension product.
  • the thermocycling is typically carried out by repeatedly increasing and decreasing the temperature of the mixture within a temperature range whose lower limit is about 30 degrees Celsius (30° C.) to about 55° C. and whose upper limit is about 90° C. to about 100° C.
  • the increasing and decreasing can be continuous, but is preferably phasic with time periods of relative temperature stability at each of the temperatures favouring polynucleotide synthesis, denaturation and hybridization.
  • the result may be used to verify the presence or absence of a specific chemical entity during the formation of the display molecule.
  • the formation of an extension product is indicative of the presence of an oligonucleotide part complementary to the primer in the identifier oligonucleotide.
  • the absence of an extension product is indicative of the absence of an oligonucleotide part complementary to the primer in the identifier oligonucleotide. Selecting the sequence of the primer such that it is complementary to one or more codons will therefore provide information of the structure of the encoded molecule coded for by this codon(s).
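  • The logic of codon detection by primer extension can be sketched in a few lines. This is a deliberate simplification: it treats "sufficiently complementary" as a perfect match and ignores melting temperature and polymerase behaviour. The 15-nucleotide codon sequences and the flanking sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extension_expected(identifier, primer):
    """True if the primer can anneal perfectly somewhere on the identifier."""
    return reverse_complement(primer) in identifier


# Hypothetical codons, each 15 nucleotides long.
codon_1 = "ACGTTGCAAGCTTAG"
codon_2 = "GGATCCTTAACGGTA"
codon_3 = "TTGACCGTAGGCATC"  # not used in this identifier

# Hypothetical identifier: flank + codon_1 + codon_2 + flank.
identifier = "CAGT" + codon_1 + codon_2 + "TTCA"

# A primer complementary to a codon reports whether that codon is present.
for name, codon in [("codon_1", codon_1), ("codon_2", codon_2), ("codon_3", codon_3)]:
    primer = reverse_complement(codon)
    print(name, extension_expected(identifier, primer))
```

  • An extension product is predicted for the primers against codon_1 and codon_2 and none for codon_3, mirroring how presence or absence of product reports on the chemical entities used.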
  • a second primer complementary to a sequence of the extension product is included in the mixture of the identifier oligonucleotide and the primer oligonucleotide.
  • the second primer is also termed reverse primer and ensures an exponential increase of the number of produced extension products.
  • the method using a forward and reverse primer is well known to the skilled person in the art and is generally referred to as polymerase chain reaction (abbreviated PCR) in the present application with claims.
  • the reverse primer is annealed to a part of the extension product downstream, i.e. near the 3′ end of the extension product, or a part complementing the coding part of the identifier oligonucleotide.
  • the first primer anneals to an upstream position of the identifier oligonucleotide, preferably before the coding part, and the reverse primer anneals to a sequence of the extension product complementing one or more codons or parts thereof.
  • the amplicons resulting from the PCR process may be stained during or following the reaction to ease the detection.
  • a staining after the PCR process may be prepared with e.g. ethidium bromide or a similar staining agent.
  • amplicons from the PCR process are run on an agarose gel and subsequently stained with ethidium bromide. Under UV illumination, bands of amplicons become visible. It is possible to incorporate the staining agent in the agarose gel or to allow a solution of the staining agent to migrate through the gel.
  • the amplicons may also be stained during the PCR process by an intercalating agent, like SYBR® Green. When the intercalating agent is present while the amplification proceeds, it is incorporated into the double helix. The intercalating agent may then be made visible by irradiation from a suitable source.
  • the intensity of the staining is informative of the relative abundance of a specific amplicon.
  • when a library of bifunctional complexes has been subjected to a selection, the codons in the pool of selected identifier oligonucleotides can be quantified using this method.
  • a sample of the selected identifier oligonucleotides is subjected to various PCR amplifications with different primers in separate compartments and the PCR product of each compartment is analysed by electrophoresis in the presence of ethidium bromide.
  • the bands that appear can be quantified by a densitometric analysis after irradiation by ultraviolet light and the relative abundance of the codons can be measured.
  • the primers may be labelled with a suitable small molecule, like biotin or digoxigenin.
  • a PCR-ELISA analysis may subsequently be performed based on the amplicons comprising the small molecule.
  • a preferred method includes the application of a solid support covered with streptavidin or avidin when biotin is used as label and anti-digoxigenin when digoxigenin is used as the label. Once captured, the amplicons can be detected using an enzyme-labelled avidin or anti-digoxigenin reporter molecule similar to a standard ELISA format.
  • the extension process may be measured in “real time”.
  • Several real time PCR processes have been developed, and all the suitable real time PCR processes available to the skilled person in the art can be used in the evaluating step of the present invention and are included in the present scope of protection.
  • the PCR reactions discussed below are of particular interest.
  • the monitoring of accumulating amplicons in real time has been made possible by labelling of primers, probes, or amplicons with fluorogenic molecules.
  • the real time PCR amplification is usually performed with a speed faster than the conventional PCR, mainly due to reduced cycles time and the use of sensitive methods for detection of emissions from the fluorogenic labels.
  • the most commonly used fluorogenic oligoprobes rely upon fluorescent resonance energy transfer (FRET) between fluorogenic labels or between one flourophor and a dark or “black-hole” nonfluorescent quencher (NFQ), which disperse energy as heat rather than fluorescence.
  • FRET is a spectroscopic process by which energy is passed between molecules separated by 10-100 Å that have overlapping emission and absorption spectra.
  • An advantage of many real time PCR methods is that they can be carried out in a closed system, i.e. a system which does not need to be opened to examine the result of the PCR.
  • a closed system implies a reduced result turnaround time, minimisation of the potential for carry-over contamination and the ability to closely scrutinise the assay's performance.
  • the real time PCR methods currently available to the skilled person can be classified into either amplicon sequence specific or non-specific methods.
  • the basis for the non-specific detection methods is a DNA-binding fluorogenic molecule. Included in this class are the earliest and simplest approaches to real time PCR. Ethidium bromide, YO-PRO1, and SYBR® Green I all fluoresce when associated with double stranded DNA which is exposed to a suitable wavelength of light. This approach requires the fluorescent agent to be present during the PCR process and provides for a real time detection of the fluorescent agent as it is incorporated into the double stranded helix.
  • the amplicon sequence specific methods include, but are not limited to, the TaqMan®, hairpin, LightCycler®, Sunrise®, and Scorpions methods.
  • the LightCycler® method, also designated “HybProbes”, makes use of a pair of adjacent, fluorogenic hybridisation oligonucleotide probes.
  • a first, usually the upstream, oligoprobe is labelled with a 3′ donor fluorophore and the second, usually the downstream, probe is commonly labelled with either a LightCycler Red 640 or Red 705 acceptor fluorophore at the 5′ terminus, so that when both oligoprobes are hybridised the two fluorophores are located in close proximity, such as within 10 nm, of each other.
  • the close proximity provides for the emission of fluorescence when irradiated with a suitable light source, such as a blue diode in the case of the LightCycler®.
  • the region for annealing of the probes may be any suitable position that does not interfere with the primer annealing.
  • the site for binding the probes is positioned downstream of the codon region on the identifier oligonucleotide.
  • the region for annealing the probes may be at the 3′ end of the strand complementing the identifier oligonucleotide.
  • In another embodiment of the LightCycler method, the pair of oligonucleotide probes is annealed to one or more codons, and primer sites exterior to the coding part of the identifier oligonucleotide are used for PCR amplification.
  • the TaqMan® method, also referred to as the 5′ nuclease or hydrolysis method, requires an oligoprobe which is attached to a reporter fluorophore, such as 6-carboxy-fluorescein, and a quencher fluorophore, such as 6-carboxy-tetramethyl-rhodamine, at each end.
  • This threshold cycle (CT) is defined as the PCR cycle in which the gain in fluorescence generated by the accumulating amplicons exceeds 10 standard deviations of the mean base line fluorescence.
  • the CT decreases linearly with the logarithm of the number of identifier oligonucleotide copies present in the sample.
  • the TaqMan probe is usually designed to hybridise at a position downstream of a primer binding site, be it a forward or a reverse primer.
  • When the primer is designed to anneal to one or more codons of the identifier oligonucleotide, the presence of these one or more codons is indicated by the emission of light.
  • the quantity of the identifier oligonucleotides comprising the one or more codons may be measured by the CT value.
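  • As an illustration of the threshold-cycle definition above, the following minimal Python sketch finds the first cycle at which the gain in fluorescence exceeds 10 standard deviations of the mean baseline fluorescence. The function name threshold_cycle, the use of the first 10 cycles as the baseline window, and the example trace are assumptions made for illustration only and are not taken from the present disclosure.

```python
from statistics import mean, stdev


def threshold_cycle(fluorescence, baseline_cycles=10, n_sd=10.0):
    """Return the first 1-based cycle whose gain over the baseline mean
    exceeds n_sd baseline standard deviations, or None if never reached.

    baseline_cycles is an assumed window; real instruments let the user
    choose which early cycles define the baseline.
    """
    baseline = fluorescence[:baseline_cycles]
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    for cycle, signal in enumerate(fluorescence, start=1):
        if cycle <= baseline_cycles:
            continue  # cycles inside the baseline window are not scored
        if signal - base_mean > n_sd * base_sd:
            return cycle
    return None


# Hypothetical trace: ten noisy baseline cycles followed by a rising signal.
trace = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.02, 0.98, 1.01, 0.99,
         1.05, 1.10, 1.20, 1.60, 2.40]
print(threshold_cycle(trace))
```

  • Running one such calculation per codon-specific primer would give one threshold cycle per codon, which is the quantity the codon analysis compares.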
  • the Hairpin method involves an oligoprobe, in which a fluorophore and a quencher are positioned at the termini.
  • the labels are held in close proximity by distal stem regions of homologous base pairing deliberately designed to create a hairpin structure, which results in quenching either by FRET or by a direct energy transfer by a collisional mechanism due to the intimate proximity of the labels.
  • the quencher usually operates by a mechanism different from FRET, and is suitably 4-(4′-dimethylamino-phenylazo)-benzoic acid (DABCYL).
  • In the presence of a complementary sequence, usually downstream of a primer, or within the bounds of the primer binding sites in case of more than a single primer, the oligoprobe will hybridise, shifting into an open configuration. The fluorophore is now spatially removed from the quencher's influence and fluorescence emissions are monitored during each cycle.
  • the hairpin probe may be designed to anneal to a codon in order to detect this codon if present on the identifier oligonucleotide.
  • This embodiment may be suitable if codons only differ from each other by a single or a few nucleotides, because it is well known that the occurrence of a mismatch between a hairpin oligoprobe and its target sequence has a greater destabilising effect on the duplex than the introduction of an equivalent mismatch between the target oligonucleotide and a linear oligoprobe. This is probably because the hairpin structure provides a highly stable alternate conformation.
  • the Sunrise and Scorpion methods are similar in concept to the hairpin oligoprobe, except that the label becomes irreversibly incorporated into the PCR product.
  • the Sunrise method involves a primer (commercially available as Amplifluor™ hairpin primers) comprising a 5′ fluorophore and a quencher, e.g. DABCYL.
  • the labels are separated by complementary stretches of sequence that create a stem when the sunrise primer is closed.
  • a target specific primer sequence is a target specific primer sequence.
  • the target sequence is a codon, optionally more codons.
  • the sunrise primer's sequence is intended to be duplicated by the nascent complementary strand and, in this way, the stem is destabilised, the two labels are held apart, usually by between 15 and 25 nucleotides, and the fluorophore is free to emit its excitation energy for monitoring.
  • the Scorpion primer resembles the sunrise primer, but differs in having a moiety that blocks duplication of the signalling portion of the scorpion primer.
  • the blocking moiety is typically hexaethylene glycol.
  • the function of the scorpion primers differs slightly in that the 5′ region of the oligonucleotide is designed to hybridise to a complementary region within the amplicons.
  • the complementary region is a codon on the identifier oligonucleotide.
  • the codon profile is indicative of the chemical entities that have been used in the synthesis of encoded molecules having a certain property, such as an affinity towards a target.
  • If the selection has been sufficiently effective, it may be possible to deduce directly a part or the entire structure of encoded molecules with the desired property. Alternatively, it may be possible to deduce a structural unit appearing more frequently among the encoded molecules after the selection, which gives important information on the structure-activity-relationship (SAR). If the selection process has not narrowed the size of the library to a manageable number, the formation of a second-generation library is useful.
  • In the second-generation library, chemical entities which have not been involved in the synthesis of encoded molecules that have been successful in the selection may be omitted, thus limiting the size of the new library and at the same time increasing the concentration of complexes with the requested property, e.g. the ability to bind to a target.
  • the second-generation library may then be subjected to more stringent selection conditions to allow only the encoded molecules with a higher affinity to bind to the target.
  • the second-generation library may also be generated using the chemical entities coded for in addition to certain chemical entities suspected of increasing the performance of the final encoded molecule.
  • the indication of certain successful chemical entities may be obtained from the SAR.
  • the use in a second-generation library of chemical entities which have proved to be interesting for further investigation in a preceding library, may thus entail a shuffling with new chemical entities that may focus the second-generation library in a certain desired direction.
  • nucleic acid fragment is associated with a chemical entity precursor capable of being transferred to a recipient reactive group.
  • the recipient reactive group may be a part of a chemical scaffold and the chemical entity precursor may add a structural unit to said scaffold. It is preferred that the nucleic acid fragment codes for the chemical entity.
  • each member of the nucleic acid fragment pool comprises an anticodon, which identifies the chemical entity. When a plurality of chemical entities are present the anticodon is preferably unique, i.e. a unique correspondence between the chemical entities and the associated anticodons exists.
  • the identifier nucleic acid sequence comprises codons, which may be able to pair with one or more anticodons of the pool of nucleic acid fragments.
  • the pairing between one or more codons of an identifier nucleic acid sequence and one or more anticodons is preferably specific, i.e. the one or more codons of the identifier nucleic acid sequence are only recognized by particular anticodons.
  • the nucleic acid fragment containing more than one anticodon can encode for scaffold molecules where each anticodon encodes for specific chemical entities of that scaffold molecule. The specific pairing makes it possible implicitly to decode the codon of an identifier nucleic acid sequence.
  • non-specific pairing between codons and anticodons can be cleaved with an enzyme or chemically treated to break the double stranded nucleotides.
  • the non-pairing region can be cleaved using enzymes that specifically cleave nucleotide sequences with mismatches.
  • the enzyme is selected from T4 endonuclease VII, T4 endonuclease I, CEL I, nuclease S1, or variants thereof.
  • the cleavage is preferably used when more than one codon and anticodon are involved in pairing between the identifier nucleic acid sequence and the nucleic acid fragment.
  • the pool of nucleic acid fragments associated with a chemical entity may comprise anticodons complemented by codons of one or more identifier nucleic acid sequences as well as anticodons which are not complemented by codons on any identifier nucleic acid sequence.
  • the amount of genetic information contained in the anticodons of the pool is larger than the amount of genetic information complemented by the codons.
  • the contacting of the one or more identifier nucleic acid sequences with the pool of nucleic acid fragments is usually conducted under conditions which allow for hybridisation, i.e. conditions under which cognate nucleic acid sequences can anneal to each other.
  • the identifier nucleic acid sequences are usually immobilized on a solid support.
  • suitable solid supports include beads and column material, e.g. beads and column material associated with a second part of a molecular affinity pair to bind identifier nucleic acid sequences attached to the first part of the molecular affinity pair.
  • the solid support is associated with streptavidin and the identifier nucleic acid sequences are attached to biotin.
  • the pool of nucleic acid fragments is typically present in a mobile phase, i.e. dissolved in a liquid.
  • the identifier nucleic acids will hybridise to those nucleic acid fragments in the pool which are sufficiently complementary to a particular part of an identifier nucleic acid sequence for binding to occur. Fragments not finding any complementing sequence will remain in the solution.
  • When the identifier nucleic acid sequences are segregated into codons and the fragments comprise anticodons, the anticodons which are able to anneal to a codon will be caught, while fragments not having a cognate codon will be maintained in the mobile phase.
  • When codons and anticodons are present in the method of the present invention, specific hybridisation implies that the tendency of an anticodon to cross-hybridise to another codon will be impeded or avoided.
  • codons may be designed such that each codon is distinguished from all other codons by one, two or more mismatching nucleotides.
  • the mobile phase is subsequently separated from the solid phase e.g. by washing, and the enriched pool of fragments is recovered.
  • the recovery of the nucleic acid fragments is usually done by subjecting the hybrid to denaturing conditions, i.e. conditions which separate the two strands. If the parent nucleic acid sequences are immobilized on beads, the separation of the fragments can be effected using denaturing conditions and centrifugation/spinning.
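  • The enrichment of the fragment pool by hybridisation described above can be sketched in Python as a simple filter. This is an idealised model that assumes a fragment is caught only when its anticodon is the exact reverse complement of a codon on an immobilised identifier; the function names, sequences and entity labels are hypothetical, and real hybridisation also depends on temperature, salt and partial matches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def enrich(identifiers, fragment_pool):
    """Return the fragments whose anticodon pairs with an identifier codon.

    identifiers: list of identifiers, each a list of codon strings.
    fragment_pool: dict mapping anticodon string -> chemical entity label.
    Fragments without a cognate codon stay in the mobile phase and are
    therefore absent from the returned (enriched) pool.
    """
    codons = {codon for identifier in identifiers for codon in identifier}
    return {
        anticodon: entity
        for anticodon, entity in fragment_pool.items()
        if reverse_complement(anticodon) in codons
    }


# Hypothetical selected identifier carrying two codons.
selected = [["ACGTA", "GGCTA"]]
pool = {"TACGT": "entity_A", "TAGCC": "entity_B", "AAAAA": "entity_C"}
print(enrich(selected, pool))
```

  • In this toy pool, entity_C has no cognate codon and is washed away, which mirrors the case where the pool holds more genetic information than the codons complement.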
  • the enriched pool of nucleic acid fragments associated with a chemical entity may be used directly to prepare a next generation library of complexes, in which each member of the library comprises an encoded molecule and the nucleic acid sequence which codes for this molecule.
  • building blocks comprising a particular transferable chemical entity associated with an anticodon corresponding to the anticodons of the detected fragments are used in the generation of the next generation library.
  • additional building blocks are added having modified transferable chemical entities in order to improve on a certain property of the encoded molecule.
  • the complexes may be prepared by various known methods starting from the nucleic acid fragment comprising the anticodon and the chemical entity, as disclosed above.
  • the next generation library is formed by a) mixing under hybridisation conditions, nascent bifunctional complexes comprising a chemical entity or a reaction product of chemical entities, and an identifier nucleic acid sequence comprising codon(s) identifying said chemical entities, with the recovered nucleic acid fragments, said fragments comprising an oligonucleotide sufficiently complementary to at least a part of the identifier nucleic acid sequence to allow for hybridisation, a transferable chemical entity and an anticodon identifying the chemical entity, to form hybridisation products; and b) transferring the chemical entities of the nucleic acid fragments to the nascent bifunctional complexes through a reaction involving a reactive group of the nascent bifunctional complex, in conjunction with a transfer of the genetic information of the anticodon.
  • the above method for preparing the next generation library comprises the further step of c) separating the components of the hybridisation product and recovering the complexes. If further chemical entities are intended to participate in the formation of the encoded molecule of the nascent complex, steps a) through c) are repeated as appropriate using the recovered complexes in step c) as the nascent bifunctional complexes in step a) of the next round.
  • the genetic information of the anticodon may be transferred to the nascent complex by a variety of methods. According to a first embodiment the genetic information of the anticodon is transferred by enzymatically extending the oligonucleotide identifier region to obtain a codon attached to the bifunctional complex having received the chemical entity. A second embodiment implies that genetic information of the anticodon is transferred to the nascent complexes by hybridisation to a cognate codon of the nascent complex.
  • the enriched pool of fragments comprises an affinity oligonucleotide sufficiently complementary to an identifier region of the nascent complex, said oligonucleotide being distinct from the anticodon. Accordingly, the oligonucleotide identifier region of the nascent complex anneals to the affinity oligonucleotide of the building block to form the hybridisation product, while the anticodon remains single stranded. Subsequently, the chemical entity is transferred to the recipient reactive group of the complex to form the encoded molecule prior to, simultaneously with, or subsequent to the enzymatic extension of the hybridisation product using the anticodon as identifier.
  • Suitable enzymes are polymerases and ligases, which require dNTPs and oligonucleotides, respectively, as substrates.
  • the method for forming the complexes according to this first embodiment is the subject of PCT/DK03/00739, the content thereof being incorporated herein by reference.
  • the anticodon forms part of the affinity oligonucleotide, i.e. the anticodon is a part of or the entire affinity oligonucleotide.
  • a plurality of identifiers comprising different codons and/or different order of codons is provided.
  • the identifiers are associated with a recipient reactive group, i.e. the reactive group may be covalently attached to the identifier or attached by hybridisation.
  • a codon of the identifier may be used for the attachment of a building block harbouring the reactive group.
  • the identifiers are subsequently contacted with the enriched pool of building blocks, i.e. nucleic acid fragments associated with a transferable chemical entity.
  • the mixture of identifiers and building blocks is maintained at hybridisation conditions to anneal the anticodon of the building blocks to the cognate codon of the identifier. After or simultaneously with the annealing step, the chemical entity is transferred to the recipient reactive group of the identifier.
  • the method for forming the complexes according to the second embodiment is the subject of various patent applications, including WO 02/103008, WO 02/074929, Danish patent application No. PA 2002 01347, and U.S. provisional patent application No. 60/409,968. The content of these patent applications is incorporated herein by reference in its entirety.
  • the new generation of library complexes may be used in a partition step, in which the library of complexes is subjected to a condition partitioning complexes displaying a predetermined property from the remainder of the next generation library, as explained above.
  • the outcome of a codon analysis will be dependent on the enrichment factor in the selection process.
  • An efficient and specific selection will generate a large difference between the specific binders and the background. Still, there will be a large number of molecules in the background that will reduce the possibility of obtaining measurable differences between the binders and the background in the codon analysis procedure. If the enrichment factor is not good enough (or the library is too large) to distinguish a specific binder among the background binders, the signal in the codon analysis will probably not be detectable. However, there will be a continuum of binders that use a certain chemical entity in a certain position.
  • non-optimal binders (a certain important chemical entity in one position and a less important one in the other position) will be numerous due to the diversity obtained when only one (or a few) positions are important in the selection process. Therefore, the sum of all molecules with a preferable chemical entity in a certain position will be larger than the sum of all molecules with a non-binding chemical entity, which will make the codon analysis easier.
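  • The argument above, that summing over all molecules sharing a chemical entity in one position gives a stronger signal than any single molecule, can be illustrated with a short Python sketch. The function name codon_profile and the toy pool of selected identifiers are hypothetical and chosen only to show the counting; they are not data from the present disclosure.

```python
from collections import Counter


def codon_profile(identifiers):
    """Count, for each codon position, how often each codon occurs.

    identifiers: iterable of equal-length tuples of codon labels.
    Returns a list of Counters, one per position (empty list if no input).
    """
    identifiers = list(identifiers)
    if not identifiers:
        return []
    profile = [Counter() for _ in range(len(identifiers[0]))]
    for identifier in identifiers:
        for position, codon in enumerate(identifier):
            profile[position][codon] += 1
    return profile


# Hypothetical selected pool: codon A1 matters in position 0, while the
# codon in position 1 is of little importance and therefore varies.
selected = [("A1", "B1"), ("A1", "B2"), ("A1", "B3"), ("A2", "B1"), ("A1", "B4")]
profile = codon_profile(selected)
print(profile[0])            # per-position signal for position 0
print(Counter(selected))     # every full identifier occurs only once
```

  • In this toy pool no complete identifier occurs more than once, yet codon A1 accounts for four of the five counts in position 0, so the per-position sum is measurable where the individual binder is not.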
  • This invention may involve an extensive analysis of all the chemical entities in a library and how they are involved in the binding to targets. This information can be used both to design new libraries and in the final process where the lead structures are produced and pre-clinical candidates are picked.
  • the extensive data obtained in the codon analysis can for instance be used for selecting candidates with the appropriate specificity. This can be done if selection has been performed on a family of proteins where one of the members is the target.
  • the invention enables pharmacophore identification and transformation into small molecule drugs.
  • the peptide/peptidomimetic lead to small molecule conversion process is supported by medicinal chemistry and cheminformatics and guided by matching the pharmacophore derived from massive structure activity relationship (SAR) data from the codon analysis.
  • a “pharmacophore” is a description of the structural criteria a molecule must fulfil in order that it is active against a specified biological receptor. These criteria are usually the 3D spatial relationships of a set of chemical features, and sometimes include the steric boundaries, within which the molecule must fit.
  • the extensive SAR information obtained using the codon analyses described in this invention can be combined with molecular modeling technologies to refine for example pharmacophore models and the plausible interactions between the potential binders and a target.
  • the codon analysis is also a valuable experimental tool for SAR on weak binders.
  • the codon analysis measures the abundance of chemical entities after a selection in all binding molecules. Thus, even weak binders, of which there might be many, are detected even though the detected codon is selected in many different combinations.
  • the selection procedure can also be tuned to enrich predominately for weak binders, which will simplify the codon analysis data.
  • This invention is also suitable for replacing the laborious task of extracting SAR information by hand with an automated process using suitable algorithms and software programs.
  • the codon analysis e.g. array or QPCR measurements
  • the SAR information and potential pharmacophore models obtained from the codon analysis can be used to design focused libraries in an array format allowing massive and parallel testing.
  • the selection procedure and codon analysis can be seen as a diversity reduction step to allow a complete test of potential binders in an array format.
  • a modified sequencing technique preferably identifies the codons in each position occurring with the highest frequency.
  • the next generation library is then built using in each position the chemical entities occurring with the highest frequency.
  • the codon identification step uses the entire population of identifier nucleic acid sequences in the analysis and informs the experimenter of the relative abundance of each codon in a certain position.
  • the codon information may be obtained using microarray, QPCR, or any equivalent method for revealing the identity of codons.
  • sequencing a subset of identifier nucleic acid sequences only provides the experimenter with a limited insight as to the population of codons and the corresponding encoded molecules.
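  • The step of building the next generation library from the most frequent codons in each position can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the cut-off of two codons per position and the abundance values are hypothetical, and in practice the abundances would come from the microarray or QPCR measurements mentioned above.

```python
from itertools import product


def select_top_codons(abundance, keep=2):
    """Keep the `keep` most abundant codons at each position.

    abundance: list with one dict per codon position, mapping a codon
    label to its measured relative abundance after selection.
    """
    kept = []
    for position in abundance:
        ranked = sorted(position, key=position.get, reverse=True)
        kept.append(ranked[:keep])
    return kept


def build_library(kept):
    """Enumerate every codon combination of the reduced library."""
    return list(product(*kept))


# Hypothetical relative abundances for a two-position library.
abundance = [
    {"A1": 0.6, "A2": 0.3, "A3": 0.1},
    {"B1": 0.2, "B2": 0.7, "B3": 0.1},
]
kept = select_top_codons(abundance, keep=2)
library = build_library(kept)
print(kept)
print(len(library))
```

  • Here the second-generation library has 4 members instead of the original 9, and the saving grows quickly with the number of positions and of chemical entities per position.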
  • the complex comprises an encoded molecule and an identifier oligonucleotide.
  • the identifier comprises codons that identify the encoded molecule.
  • the identifier oligonucleotide identifies the encoded molecule uniquely, i.e. in a library of complexes a particular identifier is capable of distinguishing the molecule it is attached to from the rest of the molecules.
  • the encoded molecule and the identifier may be attached directly to each other or through a bridging moiety.
  • the bridging moiety is a selectively cleavable linkage.
  • the identifier oligonucleotide may comprise two or more codons. In a preferred aspect the identifier oligonucleotide comprises three or more codons. The sequence of each codon can be decoded utilizing the present method to identify reactants used in the formation of the encoded molecule. When the identifier comprises more than one codon, each member of a pool of chemical entities can be identified and the order of codons is informative of the synthesis step each member has been incorporated in.
  • the same codon is used to code for several different chemical entities.
  • the structure of the encoded molecule can be deduced taking advantage of the knowledge of different attachment chemistries, steric hindrance, deprotection of orthogonal protection groups, etc.
  • the same codon is used for a group of chemical entities having a common property, such as a lipophilic nature, a certain attachment chemistry etc.
  • the codon is unique i.e. a similar combination of nucleotides does not appear on the identifier oligonucleotide coding for another chemical entity. In a practical approach, for a specific chemical entity, only a single combination of nucleotides is used.
  • the two or more codons identifying the same chemical entity may carry further information related to different reaction conditions.
  • each codon may have any suitable length.
  • the codon may be a single nucleotide or a plurality of nucleotides. In some aspects of the invention, it is preferred that each codon independently comprises four or more nucleotides, more preferred 4 to 30 nucleotides. In some aspects of the invention the lengths of the codons vary.
  • a certain codon may be distinguished from any other codon in the library by only a single nucleotide.
  • When a codon length of 5 nucleotides is selected, more than 100 nucleotide combinations exist in which two or more mismatches appear.
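  • The statement that more than 100 codons of 5 nucleotides can be chosen so that any two differ by two or more mismatches can be checked with a small Python sketch. The checksum construction used here (the last nucleotide is fixed by the first four) is an assumption chosen for illustration, not the design method of the present disclosure; it is merely one known way of guaranteeing at least two mismatches between any pair.

```python
from itertools import combinations, product

BASES = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def checksum_codons(length=5):
    """Build codons whose last base is a checksum of the preceding bases.

    Bases are treated as digits 0-3; the last digit makes the digit sum
    a multiple of 4, so changing any single base breaks the checksum and
    two valid codons can never differ in exactly one position.
    """
    codons = []
    for prefix in product(range(4), repeat=length - 1):
        check = (-sum(prefix)) % 4
        codons.append("".join(BASES[i] for i in prefix + (check,)))
    return codons


codons = checksum_codons(5)
min_distance = min(hamming(a, b) for a, b in combinations(codons, 2))
print(len(codons), min_distance)
```

  • The construction yields 256 codons with a minimum of two mismatches between any pair, which is consistent with the figure of more than 100 given above.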
  • the identifier oligonucleotide will in general have at least two codons arranged in sequence, i.e. next to each other. Two neighbouring codons may be separated by a framing sequence. Depending on the encoded molecule formed, the identifier may comprise further codons, such as 3, 4, 5, or more codons. Each of the further codons may be separated by a suitable framing sequence. Preferably, all or at least a majority of the codons of the identifier are separated from a neighbouring codon by a framing sequence.
  • the framing sequence may have any suitable number of nucleotides, e.g. 1 to 20. Alternatively, codons on the identifier may be designed with overlapping sequences.
  • the framing sequence may serve various purposes.
  • the framing sequence identifies the position of the codon.
  • the framing sequence either upstream or downstream of a codon comprises information which positions the chemical entity and the reaction conditions in the synthesis history of the encoded molecule.
  • the framing sequence may also or in addition provide for a region of high affinity. The high affinity region may ensure that a hybridisation event with an anti-codon will occur in frame.
  • the framing sequence may adjust the annealing temperature to a desired level.
  • a framing sequence with high affinity can be provided by incorporation of one or more nucleobases forming three hydrogen bonds to a cognate nucleobase.
  • nucleobases having this property are guanine and cytosine.
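  • The effect of guanine and cytosine in a framing sequence on the annealing temperature can be estimated with the Wallace rule, a rough rule of thumb for short oligonucleotides that assigns about 2 °C per A or T and 4 °C per G or C. The rule is not part of the present disclosure and is only a coarse approximation; the function name and the example sequences below are hypothetical.

```python
def wallace_tm(seq):
    """Estimate the melting temperature (deg C) of a short oligonucleotide
    using the Wallace rule: 2 * (A + T) + 4 * (G + C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


codon = "ATTAT"                      # hypothetical AT-rich codon
framed = "GC" + codon + "GC"         # same codon flanked by G/C framing bases
print(wallace_tm(codon), wallace_tm(framed))
```

  • In this example the G/C framing bases raise the estimate from 10 to 26, illustrating how a framing sequence may be used to bring codons of different composition towards a common annealing temperature.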
  • the framing sequence may be subjected to backbone modification.
  • backbone modifications which provide for higher affinity include 2′-O-methyl substitution of the ribose moiety, peptide nucleic acids (PNA), and 2′-O-methylene cyclisation of the ribose moiety, also referred to as LNA (Locked Nucleic Acid).
  • the sequence comprising a codon and an adjacent framing sequence has in a certain aspect of the invention a total length of 11 nucleotides or more, preferably 15 nucleotides or more.
  • a primer may be designed to be complementary to the codon sequence as well as the framing sequence. The presence of an extension reaction under conditions allowing for such reaction to occur is indicative of the presence of the chemical entity encoded in the codon as well as the position said chemical entity has in the entire synthesis history of the encoded molecule.
  • the identifier may comprise flanking regions around the coding section.
  • the flanking regions can also serve as priming sites for amplification reactions, such as PCR, or as a binding region for an oligonucleotide probe.
  • the identifier may in certain embodiments comprise an affinity region having the property of being able to hybridise to a building block.
  • the identifier oligonucleotide may be in the sense or the anti-sense format, i.e. the identifier can be a sequence of codons which actually codes for the encoded molecule or can be a sequence complementary thereto. Moreover, the identifier may be single-stranded or double-stranded, as appropriate.
  • the encoded molecule part of the complex is generally of a structure expected to have an effect according to the property sought, e.g. the encoded molecule has a binding affinity towards a target.
  • the encoded molecule is generally a possible drug candidate.
  • the complex may be formed by tagging a library of different possible drug candidates with a tag, e.g. a nucleic acid tag identifying each possible drug candidate.
  • the molecule formed by a variety of reactants which have reacted with each other and/or a scaffold molecule.
  • this reaction product may be post-modified to obtain the final molecule displayed on the complex.
  • the post-modification may involve the cleavage of one or more chemical bonds attaching the encoded molecule to the identifier in order more efficiently to display the encoded molecule.
  • an encoded molecule generally starts from a scaffold, i.e. a chemical unit having one or more reactive groups capable of forming a connection to another reactive group positioned on a chemical entity, thereby generating an addition to the original scaffold.
  • a second chemical entity may react with a reactive group also appearing on the original scaffold or a reactive group incorporated by the first chemical entity.
  • Further chemical entities may be involved in the formation of the final reaction product.
  • the formation of a connection between the chemical entity and the nascent encoded molecule may be mediated by a bridging molecule. As an example, if the nascent encoded molecule and the chemical entity both comprise an amine group a connection between these can be mediated by a dicarboxylic acid.
  • a synthetic molecule is in general produced in vitro and may be a naturally occurring or an artificial substance. Usually, a synthetic molecule is not produced using the natural translation system in an in vitro process.
  • the chemical entities that are precursors for structural additions or eliminations of the encoded molecule may be attached to a building block prior to the participation in the formation of the reaction product leading to the final encoded molecule.
  • the building block generally comprises an anti-codon.
  • the building blocks also comprise an affinity region providing for affinity towards the nascent complex.
  • the chemical entities are suitably mediated to the nascent encoded molecule by a building block, which further comprises an anticodon.
  • the anti-codon serves the function of transferring the genetic information of the building block in conjunction with the transfer of a chemical entity.
  • the transfer of genetic information and chemical entity may occur in any order.
  • the chemical entities are preferably reacted without enzymatic interaction in some aspects of the invention.
  • the reaction of the chemical entities is preferably not mediated by ribosomes or enzymes having similar activity.
  • enzymes are used to mediate the reaction between a chemical entity and a nascent encoded molecule.
  • the genetic information of the anti-codon is transferred by specific hybridisation to a codon on a nucleic acid identifier.
  • Another method for transferring the genetic information of the anti-codon to the nascent complex is to anneal an oligonucleotide complementary to the anti-codon and attach this oligonucleotide to the complex, e.g. by ligation.
  • a still further method involves transferring the genetic information of the anti-codon to the nascent complex by an extension reaction using a polymerase and a mixture of dNTPs.
  • the chemical entity of the building block may in most cases be regarded as a precursor for the structural entity eventually incorporated into the encoded molecule. In other cases the chemical entity provides for the elimination of chemical units of the nascent encoded molecule. Therefore, when it is stated in the present application and claims that a chemical entity is transferred to a nascent encoded molecule, it is to be understood that not necessarily all the atoms of the original chemical entity are to be found in the eventually formed encoded molecule. Also, as a consequence of the reactions involved in the connection, the structure of the chemical entity can be changed when it appears on the nascent encoded molecule. In particular, the cleavage resulting in the release of the entity may generate a reactive group which in a subsequent step can participate in the formation of a connection between a nascent complex and a chemical entity.
  • the chemical entity of the building block comprises at least one reactive group capable of participating in a reaction which results in a connection between the chemical entity of the building block and another chemical entity or a scaffold associated with the nascent complex.
  • the number of reactive groups which appear on the chemical entity is suitably one to ten.
  • a building block featuring only one reactive group is used i.a. in the end positions of polymers or scaffolds, whereas building blocks having two reactive groups are suitable for the formation of the body part of a polymer or scaffolds capable of being reacted further.
  • One, two or more reactive groups intended for the formation of connections, are typically present on scaffolds.
  • Non-limiting examples of scaffolds are opiates, steroids, benzodiazepines, hydantoins, and peptidylphosphonates.
  • the reactive group of the chemical entity may be capable of forming a direct connection to a reactive group of the nascent complex or the reactive group of the building block may be capable of forming a connection to a reactive group of the nascent complex through a bridging fill-in group. It is to be understood that not all the atoms of a reactive group are necessarily maintained in the connection formed. Rather, the reactive groups are to be regarded as precursors for the structure of the connection.
  • the subsequent cleavage step to release the chemical entity from the building block can be performed in any appropriate way.
  • the cleavage involves usage of a chemical reagent or an enzyme.
  • the cleavage results in a transfer of the chemical entity to the nascent encoded molecule or in a transfer of the nascent encoded molecule to the chemical entity of the building block.
  • the new chemical groups may be used for further reaction in a subsequent cycle, either directly or after having been activated. In other cases it is desirable that no trace of the linker remains after the cleavage.
  • connection and the cleavage are conducted as a simultaneous reaction, i.e. either the chemical entity of the building block or the nascent encoded molecule is a leaving group of the reaction.
  • the simultaneous connection and cleavage can also be designed such that either no trace of the linker remains or such that a new chemical group for further reaction is introduced, as described above.
  • the attachment of the chemical entity to the building block, optionally via a suitable spacer, can be at any entity available for attachment, e.g. the chemical entity can be attached to a nucleobase or the backbone. In general, it is preferred to attach the chemical entity at the phosphorus of the internucleoside linkage or at the nucleobase.
  • the attachment point is usually at the 7 position of the purines or 7-deaza-purines or at the 5 position of pyrimidines.
  • the nucleotide may be distanced from the reactive group of the chemical entity by a spacer moiety.
  • the spacer may be designed such that the conformational space sampled by the reactive group is optimized for a reaction with the reactive group of the nascent encoded molecule.
  • the encoded molecules may have any chemical structure.
  • the encoded molecule can be any compound that may be synthesized in a component-by-component fashion.
  • the synthetic molecule is a linear or branched polymer.
  • the synthetic molecule is a scaffolded molecule.
  • the term “encoded molecule” also comprises naturally occurring molecules like α-polypeptides etc., however produced in vitro, usually in the absence of enzymes such as ribosomes.
  • the synthetic molecule of the library is a non-α-polypeptide.
  • the encoded molecule may have any molecular weight. However, in order to be orally available, it is in this case preferred that the synthetic molecule has a molecular weight less than 2000 Daltons, preferably less than 1000 Daltons, and more preferred less than 500 Daltons.
  • the size of the library may vary considerably depending on the expected result of the inventive method. In some aspects, it may be sufficient that the library comprises two, three, or four different complexes. However, in most cases, more than two different complexes are desired to obtain a higher diversity. In some aspects, the library comprises 1,000 or more different complexes, more preferably 1,000,000 or more different complexes. The upper limit for the size of the library is only restricted by the size of the vessel in which the library is comprised. It may be calculated that a vial may comprise up to 10^14 different complexes.
  • the encoded molecules associated with an identifier oligonucleotide having two or more codons that code for reactants that have reacted in the formation of the molecule part of the complex may be formed by a variety of processes. Generally, the preferred methods can be used for the formation of virtually any kind of encoded molecule. Suitable examples of processes include prior art methods disclosed in WO 93/20242, WO 93/06121, WO 00/23458, WO 02/074929, and WO 02/103008, the content of which is incorporated herein by reference, as well as methods of the present applicant not yet publicly available, including the methods disclosed in PCT/DK03/00739 filed 30 Oct. 2003, and DK PA 2003 00430 filed 20 Mar. 2003. Any of these methods may be used, and the entire content of the patent applications is included herein by reference.
  • a first embodiment disclosed in more detail in WO 02/103008 is based on the use of a polymerase to incorporate unnatural nucleotides as building blocks. Initially, a plurality of identifier oligonucleotides is provided. Subsequently, primers are annealed to each of the identifiers and a polymerase extends the primer using nucleotide derivatives which have appended chemical entities. Subsequent to or simultaneously with the incorporation of the nucleotide derivatives, the chemical entities are reacted to form a reaction product.
  • the encoded molecule may be post-modified by cleaving some of the linking moieties to better present the encoded molecule.
  • the nucleotide derivatives can be incorporated and the chemical entities subsequently polymerised.
  • the chemical entities can be attached to adjacent chemical entities by a reaction of these reactive groups.
  • the reactive groups are amine and carboxylic acid, which upon reaction form an amide bond.
  • Adjacent chemical entities can also be linked together using a linking or bridging moiety. Exemplary of this approach is the linking of two chemical entities each bearing an amine group by a dicarboxylic acid.
  • Yet another approach is the use of a reactive group between a chemical entity and the nucleotide building block, such as an ester or a thioester group.
  • An adjacent building block having a reactive group such as an amine may cleave the interspaced reactive group to obtain a linkage to the chemical entity, e.g. by an amide linking group.
  • a second embodiment for obtainment of complexes disclosed in WO 02/103008 pertains to the use of hybridisation of building blocks to an identifier and reaction of chemical entities attached to the building blocks in order to obtain a reaction product.
  • This approach comprises that identifiers are contacted with a plurality of building blocks, wherein each building block comprises an anti-codon and a chemical entity.
  • the anti-codons are designed such that they recognise a sequence, i.e. a codon, on the identifier. Subsequent to the annealing of the anti-codon and the codon to each other a reaction of the chemical entity is effected.
  • the identifier may be associated with a scaffold. Building blocks bringing chemical entities in may be added sequentially or simultaneously and a reaction of the reactive group of the chemical entity may be effected at any time after the annealing of the building blocks to the identifier.
  • a third embodiment for the generation of a complex includes chemical or enzymatic ligation of building blocks when these are lined up on an identifier. Initially, identifiers are provided, each having one or more codons. The identifiers are contacted with building blocks comprising anti-codons linked to chemical entities. The two or more anti-codons annealed on an identifier are subsequently ligated to each other and a reaction of the chemical entities is effected to obtain a reaction product. The method is disclosed in more detail in DK PA 2003 00430 filed 20 Mar. 2003.
  • a fourth embodiment makes use of the extension by a polymerase of an affinity sequence of the nascent complex to transfer the anti-codon of a building block to the nascent complex.
  • the method implies that a nascent complex comprising a scaffold and an affinity region is annealed to a building block comprising a region complementary to the affinity section. Subsequently, the anti-codon region of the building block is transferred to the nascent complex by a polymerase.
  • the chemical entity may be transferred prior to, simultaneously with or subsequent to the transfer of the anti-codon. This method is disclosed in detail in PCT/DK03/00739.
  • a fifth embodiment, also disclosed in PCT/DK03/00739, comprises reaction of a reactant with a reaction site on a nascent bifunctional molecule and addition of a nucleic acid tag to the nascent bifunctional molecule using an enzyme, such as a ligase.
  • the codons are either pre-made into one or more identifiers before the encoded molecules are generated or the codons are transferred simultaneously with the formation of the encoded molecules.
  • linkers to the identifier may be cleaved, however, usually at least one linker is maintained to provide for the complex.
  • the nucleotides used in the present invention may be linked together in a sequence of nucleotides, i.e. an oligonucleotide.
  • Each nucleotide monomer is normally composed of two parts, namely a nucleobase moiety, and a backbone.
  • the backbone may in some cases be subdivided into a sugar moiety and an internucleoside linker.
  • nucleobase may be selected among naturally occurring nucleobases as well as non-naturally occurring nucleobases.
  • nucleobase includes not only the known purine and pyrimidine heterocycles, but also heterocyclic analogues and tautomers thereof.
  • nucleobases are adenine, guanine, thymine, cytosine, uracil, purine, xanthine, diaminopurine, 8-oxo-N6-methyladenine, 7-deazaxanthine, 7-deazaguanine, N4,N4-ethanocytosine, N6,N6-ethano-2,6-diaminopurine, 5-methylcytosine, 5-(C3-C6)-alkynylcytosine, 5-fluorouracil, 5-bromouracil, pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine, inosine and the “non-naturally occurring” nucleobases described in Benner et al., U.S.
  • nucleobase is intended to cover these examples as well as analogues and tautomers thereof.
  • nucleobases are adenine, guanine, thymine, cytosine, 5-methylcytosine, and uracil, which are considered as the naturally occurring nucleobases in relation to therapeutic and diagnostic application in humans.
  • backbone units are shown below (B denotes a nucleobase):
  • the sugar moiety of the backbone is suitably a pentose but may be the appropriate part of a PNA or a six-member ring.
  • Suitable examples of possible pentoses include ribose, 2′-deoxyribose, 2′-O-methyl-ribose, 2′-fluoro-ribose, and 2′,4′-O-methylene-ribose (LNA).
  • the nucleobase is attached to the 1′ position of the pentose entity.
  • the internucleoside linker connects the 3′ end of a preceding monomer to the 5′ end of a succeeding monomer when the sugar moiety of the backbone is a pentose, like ribose or 2′-deoxyribose.
  • the internucleoside linkage may be the naturally occurring phosphodiester linkage or a derivative thereof. Examples of such derivatives include phosphorothioate, methylphosphonate, phosphoramidate, phosphotriester, and phosphodithioate.
  • the internucleoside linker can be any of a number of non-phosphorous-containing linkers known in the art.
  • Preferred nucleic acid monomers include naturally occurring nucleosides forming part of the DNA as well as the RNA family connected through phosphodiester linkages.
  • the members of the DNA family include deoxyadenosine, deoxyguanosine, deoxythymidine, and deoxycytidine.
  • the members of the RNA family include adenosine, guanosine, uridine, cytidine, and inosine.
  • Inosine is a non-specific pairing nucleoside and may be used as universal base because inosine can pair nearly isoenergetically with A, T, and C.
  • Other compounds having the same ability of non-specifically base-pairing with natural nucleobases have been formed. Suitable compounds which may be utilized in the present invention include, among others, the compounds depicted below
  • the chemical entities or reactants that are precursors for structural additions or eliminations of the encoded molecule may be attached to a building block prior to the participation in the formation of the reaction product leading to the final encoded molecule.
  • the building block generally comprises an anti-codon.
  • the chemical entity of the building block comprises at least one reactive group capable of participating in a reaction, which results in a connection between the chemical entity of the building block and another chemical entity or a scaffold associated with the nascent complex.
  • the connection is facilitated by one or more reactive groups of the chemical entity.
  • the number of reactive groups, which appear on the chemical entity is suitably one to ten.
  • a building block featuring only one reactive group is used i.a. in the end positions of polymers or scaffolds, whereas building blocks having two reactive groups are suitable for the formation of the body part of a polymer or scaffolds capable of being reacted further.
  • One, two or more reactive groups intended for the formation of connections are typically present on scaffolds.
  • the reactive group of the building block may be capable of forming a direct connection to a reactive group of the nascent complex or the reactive group of the building block may be capable of forming a connection to a reactive group of the nascent complex through a bridging fill-in group. It is to be understood that not all the atoms of a reactive group are necessarily maintained in the connection formed. Rather, the reactive groups are to be regarded as precursors for the structure of the connection.
  • the subsequent cleavage step to release the chemical entity from the building block can be performed in any appropriate way.
  • the cleavage involves usage of a reagent or an enzyme.
  • the cleavage results in a transfer of the chemical entity to the nascent encoded molecule or in a transfer of the nascent encoded molecule to the chemical entity of the building block.
  • the new chemical groups may be used for further reaction in a subsequent cycle, either directly or after having been activated. In other cases it is desirable that no trace of the linker remains after the cleavage.
  • connection and the cleavage are conducted as a simultaneous reaction, i.e. either the chemical entity of the building block or the nascent encoded molecule is a leaving group of the reaction.
  • the simultaneous connection and cleavage can also be designed such that either no trace of the linker remains or such that a new chemical group for further reaction is introduced, as described above.
  • the attachment of the chemical entity to the building block, optionally via a suitable spacer, can be at any entity available for attachment, e.g. the chemical entity can be attached to a nucleobase or the backbone. In general, it is preferred to attach the chemical entity at the phosphorus of the internucleoside linkage or at the nucleobase.
  • the attachment point is usually at the 7 position of the purines or 7-deaza-purines or at the 5 position of pyrimidines.
  • the nucleotide may be distanced from the reactive group of the chemical entity by a spacer moiety.
  • the spacer may be designed such that the conformational space sampled by the reactive group is optimized for a reaction with the reactive group of the nascent encoded molecule or reactive site.
  • the anticodon complements the codon of the identifier oligonucleotide sequence and generally comprises the same number of nucleotides as the codon.
  • the anti-codon may be adjoined with a fixed sequence, such as a sequence complementing a framing sequence.
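  • The complementarity between anti-codon and codon described above can be sketched as a reverse-complement computation. This is only an illustration: the 6-nt codon, the 2-nt framing sequence and the function name `anticodon_for` are assumptions, and only the four natural DNA bases are handled.

```python
# Watson-Crick pairing for DNA; all sequences are written 5'->3'.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def anticodon_for(codon: str, framing: str = "") -> str:
    """Return the 5'->3' sequence that anneals to `codon` followed by `framing`."""
    target = (codon + framing).upper()
    if set(target) - set("ACGT"):
        raise ValueError("only A, C, G and T are supported in this sketch")
    # Strands pair antiparallel, so complement each base and reverse the result.
    return target.translate(_COMPLEMENT)[::-1]


codon = "ACGTTC"  # hypothetical codon on the identifier
anticodon = anticodon_for(codon)  # same number of nucleotides as the codon
with_framing = anticodon_for(codon, framing="GG")  # anti-codon adjoined with a fixed sequence
```

  • The anti-codon returned here has the same length as the codon; supplying a framing sequence simply extends the annealing region by a fixed part.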
  • the building block indicated below is capable of transferring a chemical entity (CE) to a recipient nucleophilic group, typically an amine group.
  • the bold lower horizontal line illustrates the building block comprising an anti-codon and the vertical line illustrates a spacer.
  • the 5-membered substituted N-hydroxysuccinimide (NHS) ring serves as an activator, i.e. a labile bond is formed between the oxygen atom connected to the NHS ring and the chemical entity.
  • the labile bond may be cleaved by a nucleophilic group, e.g. positioned on a scaffold, to transfer the chemical entity to the scaffold, thus converting the remainder of the fragment into a leaving group of the reaction.
  • when the chemical entity is connected to the activator through a carbonyl group and the recipient group is an amine, the bond formed on the scaffold will be an amide bond.
  • Another building block which may form an amide bond, is
  • R may be absent or NO2, CF3, halogen, preferably Cl, Br, or I, and Z may be S or O.
  • a nucleophilic group can cleave the linkage between Z and the carbonyl group thereby transferring the chemical entity —(C=O)—CE′ to said nucleophilic group.
  • a building block as shown below is able to transfer the chemical entity to a recipient aldehyde group thereby forming a double bond between the carbon of the aldehyde and the chemical entity
  • the below building block is able to transfer the chemical entity to a recipient group thereby forming a single bond between the receiving moiety, e.g. a scaffold, and the chemical entity.
  • Another building block capable of transferring a chemical entity to a receiving reactive group forming a single bond is
  • the receiving group may be a nucleophile, such as a group comprising a hetero atom, thereby forming a single bond between the chemical entity and the hetero atom, or the receiving group may be an electronegative carbon atom, thereby forming a C—C bond between the chemical entity and the scaffold.
  • the chemical entity attached to any of the above building blocks may be selected from a large arsenal of chemical structures.
  • Examples of chemical entities are H or entities selected among the group consisting of a C 1 -C 6 alkyl, C 2 -C 6 alkenyl, C 2 -C 6 alkynyl, C 4 -C 8 alkadienyl, C 3 -C 7 cycloalkyl, C 3 -C 7 cycloheteroalkyl, aryl, and heteroaryl, said group being substituted with 0-3 R 4 , 0-3 R 5 and 0-3 R 9 or C 1 -C 3 alkylene-NR 4 2 , C 1 -C 3 alkylene-NR 4 C(O)R 8 , C 1 -C 3 alkylene-NR 4 C(O)OR 8 , C 1 -C 2 alkylene-O—NR 4 2 , C 1 -C 2 alkylene-O—NR 4 C(O)R 8 , C 1 -C 2 al
  • a reactive group appearing on the chemical entity precursor reacts with a recipient reactive group, e.g. a reactive group appearing on a scaffold, thereby forming a cross-link.
  • a cleavage is performed, usually by adding an aqueous oxidising agent such as I2, Br2, Cl2, H+, or a Lewis acid. The cleavage results in a transfer of the group HZ-FEP- to the recipient moiety, such as a scaffold.
  • R 8 is H, C 1 -C 6 alkyl, C 2 -C 6 alkenyl, C 2 -C 6 alkynyl, C 3 -C 7 cycloalkyl, aryl or C 1 -C 6 alkylene-aryl substituted with 0-3 substituents independently selected from —F, —Cl, —NO 2 , —R 3 , —OR 3 , —SiR 3 3
  • R 9 is =O, —F, —Cl, —Br, —I, —CN, —NO 2 , —OR 6 , —NR 6 2 , —NR 6 —C(O)R 8 , —NR 6 —C(O)OR 8 , —SR 6 , —S(O)R 6 , —S(O) 2 R 6 , —COOR 6 , —C(O)NR 6 2 and —S(O) 2 NR 6 2 .
  • Z is O or S
  • P is a valence bond
  • Q is CH
  • B is CH 2
  • R 1 , R 2 , and R 3 are H.
  • the bond between the carbonyl group and Z is cleavable with aqueous I2.
  • the partition step may be referred to as a selection or a screen, as appropriate, and includes the screening of the library for encoded molecules having predetermined desirable characteristics.
  • Predetermined desirable characteristics can include binding to a target, catalytically changing the target, chemically reacting with a target in a manner which alters/modifies the target or the functional activity of the target, and covalently attaching to the target as in a suicide inhibitor.
  • the target can be any compound of interest.
  • the target can be a protein, peptide, carbohydrate, polysaccharide, glycoprotein, hormone, receptor, antigen, antibody, virus, substrate, metabolite, transition state analogue, cofactor, inhibitor, drug, dye, nutrient, growth factor, cell, tissue, etc. without limitation.
  • Particularly preferred targets include, but are not limited to, angiotensin converting enzyme, renin, cyclooxygenase, 5-lipoxygenase, IL-1β converting enzyme, cytokine receptors, PDGF receptor, type II inosine monophosphate dehydrogenase, β-lactamases, integrin, and fungal cytochrome P-450.
  • Targets can include, but are not limited to, bradykinin, neutrophil elastase, the HIV proteins, including tat, rev, gag, int, RT, nucleocapsid etc., VEGF, bFGF, TGFβ, KGF, PDGF, thrombin, theophylline, caffeine, substance P, IgE, sPLA2, red blood cells, glioblastomas, fibrin clots, PBMCs, hCG, lectins, selectins, cytokines, ICP4, complement proteins, etc.
  • Encoded molecules having predetermined desirable characteristics can be partitioned away from the rest of the library while still attached to the identifier nucleic acid sequence by various methods known to one of ordinary skill in the art.
  • the desirable products are partitioned away from the entire library without chemical degradation of the attached nucleic acid identifier such that the identifiers are amplifiable.
  • the identifiers may then be amplified, either still attached to the desirable encoded molecule or after separation from the desirable encoded molecule.
  • the desirable encoded molecule acts on the target without any interaction between the nucleic acid attached to the desirable encoded molecule and the target.
  • the bound complex-target aggregate can be partitioned from unbound complexes by a number of methods. The methods include nitrocellulose filter binding, column chromatography, filtration, affinity chromatography, centrifugation, and other well known methods.
  • the library of complexes is subjected to the partitioning step, which may include contact between the library and a column onto which the target is immobilised.
  • Identifier nucleic acids associated with undesirable encoded molecules, i.e. encoded molecules not bound to the target under the stringency conditions used, will pass through the column.
  • Additional undesirable encoded molecules e.g. encoded molecules which cross-react with other targets
  • Desirable complexes are bound to the column and can be eluted by changing the conditions of the column (e.g., salt, pH, surfactant, etc.) or the identifier.
  • encoded molecules which react with a target can be separated from those products that do not react with the target.
  • a chemical compound which covalently attaches to the target such as a suicide inhibitor
  • the resulting complex can then be treated with proteinase, DNAse or other suitable reagents to cleave a linker and liberate the nucleic acids which are associated with the desirable chemical compound.
  • the liberated nucleic acids can be amplified.
  • the predetermined characteristic of the desirable product is the ability of the product to transfer a chemical group (such as acyl transfer) to the target and thereby inactivate the target.
  • Upon contact with the target, the desirable products will transfer the chemical group to the target, concomitantly changing the desirable product from a thioester to a thiol. Therefore, a partitioning method which would identify products that are now thiols (rather than thioesters) will enable the selection of the desirable products and amplification of the nucleic acid associated therewith.
  • the products can be fractionated by a number of common methods and each fraction is then assayed for activity.
  • the fractionation methods can include size, pH, hydrophobicity, etc.
  • Inherent in the present method is the selection of encoded molecules on the basis of a desired function; this can be extended to the selection of molecules with a desired function and specificity. Specificity can be required during the selection process by first extracting identifier nucleic acid sequences of chemical compounds which are capable of interacting with a non-desired “target” (negative selection, or counter-selection), followed by positive selection with the desired target.
  • inhibitors of fungal cytochrome P450 are known to cross-react to some extent with mammalian cytochrome P450 (resulting in serious side effects).
  • Highly specific inhibitors of the fungal cytochrome could be selected from a library by first removing those products capable of interacting with the mammalian cytochrome, followed by retention of the remaining products which are capable of interacting with the fungal cytochrome.
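  • The two-step specificity selection described above (counter-selection against a non-desired target, then positive selection on the desired target) can be sketched as two successive filters. The identifiers, the binding table and the function name `counter_select` are hypothetical; real binding is of course determined experimentally, not by a lookup.

```python
def counter_select(library, binds_off_target, binds_target):
    """Negative selection followed by positive selection."""
    # Step 1 (negative selection): discard members that interact with the non-desired target.
    remaining = [member for member in library if not binds_off_target(member)]
    # Step 2 (positive selection): retain members that interact with the desired target.
    return [member for member in remaining if binds_target(member)]


# Hypothetical toy data: identifier -> (binds fungal P450, binds mammalian P450)
toy = {
    "id1": (True, True),    # cross-reactive, removed in step 1
    "id2": (True, False),   # specific for the fungal enzyme
    "id3": (False, False),  # binds neither
    "id4": (False, True),   # binds only the mammalian enzyme
}
selected = counter_select(
    toy,
    binds_off_target=lambda m: toy[m][1],
    binds_target=lambda m: toy[m][0],
)
```

  • Only the member that binds the desired target and not the non-desired one survives both steps.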
  • FIG. 1 illustrates the overall process of building block evolution.
  • FIG. 2 shows the distribution of codons in different positions in an output from a selection.
  • FIG. 3 shows the difference between identifier driven and building block driven evolution.
  • FIG. 4 shows a method for reducing the library diversity through codon analysis.
  • FIG. 5 discloses two embodiments of using a Taqman probe (5′ nuclease probe) in the measurement of the presence or absence of a certain codon.
  • FIG. 6 shows a standard curve referred to in example 4.
  • FIG. 7 shows a result of example 4.
  • FIG. 8 discloses a result of example 4.
  • FIG. 9 discloses a scheme relating to combined structural information and codon abundances in library design.
  • FIG. 10 discloses a relationship between codon analysis and structural information.
  • FIG. 11 shows the detection of single codons of identifiers.
  • FIG. 12 shows the detection of codon pairs of identifiers.
  • FIG. 13 shows the detection of codon pairs at specific codon positions.
  • FIG. 14 shows the detection of single codons of identifiers after the separation of the individual codons.
  • FIG. 15 discloses a method for selecting from a library, complexes capable of binding to a target molecule.
  • FIG. 16 discloses a method for enriching specific nucleic acid fragments and the utility of these fragments for the generation of a new library.
  • FIG. 17 discloses a method for reducing the diversity of a library of complexes.
  • FIG. 1A shows the principal steps in building block (BB) evolution.
  • An initial library of desired size is produced.
  • This initial library is subjected to a selection process where encoded molecules that associate with a target of interest are enriched.
  • the encoding identifier oligonucleotide is preferably amplified and then used in the codon analysis step. This step monitors the relative abundance of each codon in the selected library.
  • the information obtained in this analysis is used to design a new enriched library, which contains the preferable chemical entities and their corresponding codons.
  • This new library is then subjected to a new selection process to select for binders. This diversity reduction cycle can be repeated until the desirable result is obtained and the binders have been obtained.
  • FIG. 1B shows how the diversity of a library (n^4) is reduced by reducing the number of chemical entities (n) in the library.
  • the identifier oligonucleotide that encodes the display molecule is composed of codons and is associated with the encoded molecule, as shown in FIG. 2. These codons carry information about the chemical entities in the encoded molecule. Each of these codon positions can be analysed for the precise sequence, which will reflect which chemical entities have been enriched in the selection process. The relative amount can also be obtained by comparing the signal in the measuring procedure (e.g. QPCR and array analysis). Each codon position will have its own fingerprint showing which chemical entities the selected display molecules possess. These fingerprints in each position can subsequently be used to put together a new, more focused library with a lower and more enriched diversity that can be subjected to another round of selection. This can then be repeated until the preferable encoded molecules have been obtained.
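  • The per-position codon fingerprint can be sketched as a normalisation of measured signals followed by a cut-off. The signal values, the codon names and the enrichment rule (a codon counts as enriched when its relative abundance exceeds a chosen multiple of the uniform expectation) are assumptions made for illustration and are not taken from the source.

```python
def fingerprint(signals):
    """Normalise raw per-codon signals at one codon position to relative abundances."""
    total = sum(signals.values())
    if total <= 0:
        raise ValueError("no signal at this position")
    return {codon: signal / total for codon, signal in signals.items()}


def enriched(signals, fold=2.0):
    """Codons whose relative abundance exceeds `fold` times the uniform expectation."""
    relative = fingerprint(signals)
    uniform = 1.0 / len(relative)
    return sorted(codon for codon, share in relative.items() if share > fold * uniform)


# Hypothetical signals (e.g. from QPCR or an array) for four codons at two positions
position_signals = [
    {"c1": 80.0, "c2": 10.0, "c3": 5.0, "c4": 5.0},
    {"c1": 10.0, "c2": 45.0, "c3": 40.0, "c4": 5.0},
]
focused = [enriched(signals, fold=1.5) for signals in position_signals]
```

  • With these toy numbers one codon is retained at the first position and two at the second; the retained codons per position define the chemical entities of the next, more focused library.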
  • FIG. 3 illustrates the main difference between identifier and chemical entity (CE) evolution.
  • the new library is designed to have an equal amount of each selected chemical entity, which will generate all the possible display molecules at the same concentration. This will allow all binders to compete at the same concentration and potentially retain a more diverse set of binders in each round of selection. This is especially important for small molecules, where not only the affinity is of interest.
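  • Using each selected chemical entity in equal amount means that every combination is expected at the same share of the new library. A minimal sketch of that enumeration, with hypothetical entity names and three codon positions chosen only for brevity:

```python
from itertools import product


def design_library(selected_per_position):
    """Enumerate every combination of selected chemical entities, one per codon position."""
    return list(product(*selected_per_position))


# Hypothetical selected chemical entities for three codon positions
selected = [["A1", "A2"], ["B1"], ["C1", "C2", "C3"]]
library = design_library(selected)

# With equal input of each entity, every member is expected at the same fraction.
fraction_each = 1.0 / len(library)
```

  • Here 2 x 1 x 3 = 6 display molecules result, each expected at one sixth of the library.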
  • FIG. 4 illustrates the process in which the diversity is reduced through the codon analysis.
  • An initial library of 10^10 (e.g. 317*317*317*317) library members is subjected to a selection.
  • the enriched identifier oligonucleotides are amplified and used in the codon analysis.
  • the codon analysis result is used to design a new 10^7 (e.g. 57*57*57*57) library where the enriched chemical entities are included.
  • This new library is then again subjected to a selection process.
  • the identifier oligonucleotides are amplified and used for codon analysis.
  • This new codon analysis result is again used to design a new 10^4 (e.g. 10*10*10*10) library where the enriched chemical entities are included.
  • Finally a last selection step is performed in this reduced diversity library to identify the binders.
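  • The library sizes quoted for the three rounds follow from raising the number of chemical entities per position to the fourth power. A minimal check in Python, assuming the same number of entities at each of the four positions (the function name is illustrative only):

```python
def library_size(n_entities, n_positions=4):
    """Number of distinct encoded molecules for n entities at each position."""
    return n_entities ** n_positions


rounds = [317, 57, 10]  # entities per position in the three libraries of FIG. 4
sizes = [library_size(n) for n in rounds]
for n, size in zip(rounds, sizes):
    print(f"{n} entities per position -> {size:.2e} library members")
```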
  • a preferred embodiment of the invention utilizing a universal Taqman probe is shown in FIG. 5 .
  • Four codons are shown (P1 through P4; bold pattern) along with flanking regions (light pattern).
  • A universal Taqman probe anneals to a region adjacent to the codon region, but within the amplicon defined by the universal PCR primers Pr. 1 and Pr. 2. These primers could be the same as those used for amplification of the identifier oligonucleotides encoding binders after an enrichment process on a specific target. However, if minimal-length identifiers are preferred during the encoding process, the region involved in Taqman probe annealing could be appended to the library identifier oligonucleotides by e.g.
  • The added length corresponding to the region necessary for annealing of the Taqman probe would be from 20 to 40 nt, depending on the type of Taqman probe and the T_A of the PCR primers.
  • the Q-PCR reactions are preferably performed in a 96- or 384-well format on a real-time PCR thermocycling machine.
  • Panel A shows the detection of abundance of a specific codon sequence in position one. Similar primers are prepared for all codon sequences. For each codon sequence utilized to encode a specific BB in the library a Q-PCR reaction is performed with a primer oligonucleotide complementary to the codon sequence in question. A downstream universal reverse primer Pr. 2 is provided after the Taqman probe to provide for an exponential amplification of the PCR amplicon. The setup is most suited for cases where the codon constitutes a length corresponding to a length suitable for a PCR primer.
  • Panel B shows the detection of abundance of a specific codon sequence in a specific codon position using a primer, which is complementing a codon and a framing sequence. Similar primers are used for all the codons and framing sequences.
  • a Q-PCR reaction is performed with an oligo complementary to the codon sequence in question as well as a short region up- or downstream of the codon region which ensures extension of the primer in a PCR reaction only when annealed to the codon sequence in that specific codon position.
  • the number of specific primers and Q-PCR reactions needed to cover all codon sequences in all possible codon positions equals the number of codon sequences times the number of codon positions.
  • Monitoring the abundance of 96 different codon sequences in 4 different positions can be performed in a single run on four 96-well microtitre plates (as shown in Panel B) or a single 384-well plate on a suitable instrument.
  • Quantification is performed relative to the amount of full-length PCR product obtained in a parallel control reaction on the same input material performed with the two external PCR primers Pr.1+Pr. 2. Theoretically, a similar rate of accumulation of this control amplicon compared to the accumulation of a product utilizing a single codon+sequence specific primer would indicate a 100% dominance of this particular sequence in the position in question.
  • Panels A and B employ a Taqman probe strategy, but other detection systems (SYBR Green, Molecular Beacons, etc.) could be used instead.
  • Multiplex reactions employing up to 4 different fluorophores in the same reaction could increase throughput correspondingly.
  • FIG. 9 illustrates the possibility of combining structural information about the chemical entities with their relative abundance when designing a new, more focused library.
  • The structural information about the chemical entities can be used in at least two ways. First, the similarities between the chemical entities in each position can be used to choose chemical entities for a new library. Second, the combinations of the selected chemical entities can be analyzed for patterns that generate potential ligands. This is especially useful if the binding site or the structure of a known ligand is known. Any structural analysis tool can be used that generates information about the structure of separate chemical entities or of combinations of chemical entities (the potential binders). By combining these three analysis approaches, a more focused library can be generated that will potentially contain more specific binders compared to background binders. This new focused library can be used in another round of selection to reduce the diversity. This procedure can be repeated until the desired binders have been identified.
  • FIG. 10 shows how the combination of codon analysis and structural information can generate valuable information.
  • This invention allows the performance of structure activity relationship (SAR) analysis, where the relative abundance in the codon analysis represents the activity parameter (e.g. IC₅₀ values) in the SAR measurements.
  • Pharmacophore models can be generated, focused libraries can be designed, certain follow up chemistry can be used and information in the hit to lead process can be used.
  • FIG. 11 shows an array detection system in which a single codon is detected.
  • A library of selected complexes, i.e. complexes from the initial library which display a certain property, is provided as disclosed above.
  • The initial library of complexes is prepared from e.g. 100 codons and identifiers having 4 codons in sequence, which theoretically gives a library of 10⁸ complexes.
  • The selected complexes are subjected to amplification to amplify the identifiers of the selected complexes, and the amplification products are added to an array (30).
  • The array (30) comprises probes (32) complementary to each of the codons of the identifiers (31).
  • The PCR products of the identifiers are annealed to the cognate probes of the array, and in a suitable scanner the spatial positions of the annealed probes are detected to elucidate the codons (33) of the identifier.
  • The quantity of each codon may be measured to find codons abundant in more than one identifier and/or codons leading to encoded molecules with high affinity.
  • The information may be used for decoding the encoded molecules of the complexes displaying the desired property, or for selecting building blocks which are to be added in a next round of library formation.
  • FIG. 12 discloses an array detection system for establishing codon pairs, i.e. codons in the vicinity of each other.
  • A library of complexes is prepared from 100 different codons deposited on an identifier in a sequence of four, making the total number of possible combinations 10⁸.
  • the initial library is subjected to a condition in order to select a sub-library (29) displaying a desired property.
  • the identifiers of the sub-library are amplified by a PCR reaction and the reaction product is added under hybridisation conditions to an array (34).
  • The array is designed with probes (35) capable of detecting two codons at a time. To cover all possible combinations of a library based on 100 different codons, 10⁴ probes are needed, which is practically feasible with current technology.
  • the detection of the codons may be conducted quantitatively, i.e. the relative abundance of each of the codon pairs may be determined.
  • the detection on the array may be used to reconstruct the selected identifiers (36) as three overlapping codon pair detections depict the entire identifier.
  • The information on the relative abundance of each codon pair may be used to decipher the sequence of codons of the selected identifiers, as it can be assumed that each codon pair of the same identifier appears in the same amount in the PCR products added to the array.
  • FIG. 13 discloses an array for detecting codon pairs at specific codon positions.
  • a library of complexes comprising identifiers with framing sequences is provided.
  • the framing sequence is specific for each position of the codons on the identifier.
  • Four times more probes per codon are needed on the microarray if the position of the codons is also to be detected in the analysis, which is practically feasible with current technology.
  • the position is detected due to the framing sequences next to each codon.
  • the initial library is subjected to a selection process to isolate complexes (37) having a desired property.
  • the selected complexes are amplified by a PCR reaction and the reaction products are added to an array (38).
  • The array comprises probes capable of detecting codon pairs as well as the framing sequences (40) between the codons.
  • The framing sequence determines the position of the codon in the reaction history, i.e. it is possible to deduce which chemical entity reacted at which point in the synthesis history of the encoded molecule, thus making it possible to reconstruct the structure of the encoded molecule.
  • the detection of the codon pairs may be conducted quantitatively, i.e. the relative abundance of each of the codon pairs may be determined.
  • the detection on the array may be used to reconstruct the selected identifiers (41) as three overlapping codon pair detections depict the entire identifier.
  • The information on the relative abundance of each codon pair may be used to decipher the sequence of codons of the selected identifiers, as it can be assumed that each codon pair of the same identifier appears in the same amount in the PCR products added to the array.
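  • The reconstruction of four-codon identifiers from three overlapping, position-specific codon pair detections can be sketched as follows. The codon labels and the detected pairs are hypothetical, and the sketch ignores the abundance information that the text proposes for resolving ambiguous chains.

```python
def reconstruct_identifiers(pairs_by_position):
    """Chain overlapping codon pairs into full identifiers.

    pairs_by_position[i] is a set of (codon at position i, codon at position i+1).
    """
    chains = [list(pair) for pair in pairs_by_position[0]]
    for pairs in pairs_by_position[1:]:
        chains = [
            chain + [right]
            for chain in chains
            for left, right in pairs
            if left == chain[-1]  # the pair must overlap the end of the chain
        ]
    return sorted(tuple(chain) for chain in chains)


# Hypothetical array read-out: pairs detected at positions 1-2, 2-3 and 3-4.
detected = [
    {("A1", "B2"), ("A3", "B1")},
    {("B2", "C4"), ("B1", "C2")},
    {("C4", "D1"), ("C2", "D5")},
]
identifiers = reconstruct_identifiers(detected)
print(identifiers)
```

  • When two selected identifiers share a codon at the same position, this chaining returns every compatible combination, so the quantitative pair abundances would be needed to decide which chains are real.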
  • FIG. 14 shows an array detection system in which a single codon is detected.
  • A library of selected complexes, i.e. complexes from the initial library which display a certain property, is provided as disclosed above.
  • The initial library of complexes is prepared from e.g. 100 codons and identifiers having 4 codons in sequence, which theoretically gives a library of 10⁸ complexes.
  • The selected complexes are subjected to amplification to amplify the identifiers of the selected complexes, and the amplification products are treated with suitable reagents to cut between the individual codons (43).
  • The individual codons are then applied to the array.
  • The array (44) comprises probes (45) complementary to each of the codons of the identifiers (46).
  • The PCR products of the identifiers are annealed to the cognate probes of the array, and in a suitable scanner the spatial positions of the annealed probes are detected to elucidate the codons (47) of the identifier.
  • The quantity of each codon may be measured to find codons abundant in more than one identifier and/or codons leading to encoded molecules with high affinity.
  • The information may be used for decoding the encoded molecules of the complexes displaying the desired property, or for selecting building blocks which are to be added in a next round of library formation.
  • FIG. 15 discloses a method for selection of a suitable complex in several steps.
  • the library of complexes 1 is provided.
  • Each member of the library comprises an encoded molecule 2 composed of four chemical entities which is attached to an identifier oligonucleotide 3, which comprises four codons.
  • the initial library shown comprises three complexes.
  • the library of complexes is incubated with immobilized target molecules 4.
  • The encoded molecules having an affinity towards the target molecule are bound to the immobilized target, whereas encoded molecules not having affinity towards the target under the conditions used remain in the liquid media.
  • the complexes remaining in the liquid media are discarded by a washing process, while the bound complexes remain attached to the immobilized target molecules.
  • The washing process is usually conducted using mild stringency conditions in the initial rounds of selection. In later-stage selections the working stringency conditions are usually increased to allow only high-affinity binders to remain attached to the target. Subsequent to the washing step, the complexes having affinity towards the target molecule are recovered. The recovery process usually requires high stringency conditions to detach the encoded molecule from the immobilized target.
  • the selected sub-library resulting from the elution is subjected to an amplification process.
  • the amplification of the identifier nucleic acid sequence of the selected complexes is usually performed using the PCR method. Preferably, a modification of the PCR method is followed such that a biotin molecule is attached to one of the primers to obtain a handle for subsequent immobilization.
  • The result of the amplification step is multiple copies of the identifier nucleic acid sequences, which code for the encoded molecules that have survived the selection step.
  • FIG. 16 discloses an enrichment process of building blocks.
  • the building blocks can be used for generation of a new library.
  • identifier nucleic acid sequences are immobilized on solid support.
  • the identifier nucleic acid sequences are the product of the selection procedure described in FIG. 1 .
  • Each codon of the identifier nucleic acid sequence is identified with an uppercase letter, i.e. A, B, C, or D.
  • The immobilized identifier nucleic acid sequences are contacted with the pool of building blocks under hybridisation conditions.
  • Each of the building blocks is illustrated with a sequence complementary to a codon which may or may not be present on the identifier nucleic acid sequence.
  • The complementary sequences are indicated with an apostrophe, e.g.
  • the transferable chemical entity of a building block is illustrated with a lowercase letter.
  • the conditions providing for hybridisation of the complementing sequences of the pool of building blocks to the immobilised identifier nucleic acid sequence are preferably such that cognate nucleic acid sequences are hybridised to each other while sequences not recognizing any immobilized sequence remain in aqueous media.
  • the immobilized sequences of the identifier nucleic acid sequences are thus used as bait in catching building blocks with complementing sequences.
  • non-binding building blocks are removed by washing, whereby the part of the pool of building blocks not being able to find a complementing sequence is discarded.
  • the building blocks attached to the immobilized nucleic acid sequences are detached using dehybridisation conditions.
  • The diminished pool of building blocks may be used in a subsequent round for preparing a new library of complexes, in which the encoded molecule comprises a reaction product comprising additions from chemical entities attached to the enriched building blocks. Because the order of the building blocks which participated in the formation of the encoded molecules successful in the selection procedure is not preserved by the method for enriching building blocks, a scrambling of the encoded molecules may be obtained in some of the methods described herein for obtaining a library of complexes. In some applications of the library it will be an advantage to have a scrambling of the building blocks because an increased diversity is obtained.
  • FIG. 17 discloses a method for reducing the diversity of the library of complexes resulting from the method described in FIG. 16 .
  • The diversity induced by scrambling of the building blocks is not desired.
  • The sequences complementary to the identifier nucleic acid sequences used in FIG. 16 are provided and immobilized on a suitable solid support.
  • the complementary sequence is obtained from the PCR product resulting from the method according to FIG. 15 .
  • the complementing sequence may be obtained by extending the identifier nucleic acid sequence using a suitable primer, optionally attached to a handle such as a biotin or dinitrophenol.
  • the immobilized complementary sequence is incubated with the scrambled library under conditions, which provide for hybridisation between the complementary sequence and members of the library having affinity towards this sequence.
  • Members of the library not having affinity to the complementary sequences remain in the media and are discarded, while members of the library able to hybridise to the immobilized nucleic acid sequences are recovered. Occasionally, nucleic acids not perfectly matching the complementary sequence immobilized on the solid support are caught.
  • the hybridisation products prior to the recovery step, are treated with an enzyme capable of recognizing mismatching nucleotides and cleaving the double stranded helix in which they are situated.
  • An example of an enzyme with this ability is T4 endonuclease VII.
  • This identifier oligonucleotide was immobilized on streptavidin beads using standard protocols, i.e. 600 pmol identifier oligonucleotide with 5′-dT biotin in 50 µl 100 mM MES pH 6.0 was mixed with 50 µl SA-magnetic beads (Roche). The mixture was washed 2-3 times with 100 mM MES pH 6.0 to remove non-bound identifier oligonucleotides. To reduce background binding, the oligos and beads were incubated at RT for 10 min on a shaker, then incubated on ice for 10 min while rotating the tube. Finally, the sample was washed with 100 mM MES 4 times in 800 µl at 60° C.
  • the complementing (non-sense) strand is removed using 10 mM NaOH. This will generate single-stranded DNA with the selected codons.
  • the same procedure described in this example can be used for a collection of different identifier nucleic acid molecules that contain one or more codons.
  • the codons in the identifier nucleic acid molecules can be the same or different determined from the enrichment performed on the initial library.
  • the immobilized identifier nucleic acid molecule was mixed with the pool of nucleic acid fragments shown below.
  • This pool of fragments illustrates an original pool that was used for generating an initial library of complexes.
  • Each fragment may possess in the 3′-end a specific chemical entity that is encoded by the codon sequence.
  • These nucleic acid fragments contain a specific sequence in the codon region (underlined) while the framing region shown in boldface is identical among the fragments.
  • the pool of fragments represents different codons in the same position of the identifier nucleic acid.
  • The nucleic acid fragments are mixed with the immobilized identifier nucleic acid molecules, using 600 pmol of each nucleic acid fragment (100 mM MES pH 6.0, 150 mM NaCl). The mixture was incubated at 25° C. for 30 minutes in a shaker. The non-hybridized fragments were removed by washing 4 times in 800 µl 100 mM MES, 150 mM NaCl. This step should separate the complementing fragments (bound), which encode the selected chemical entities, from the non-complementing fragments (non-bound), which encode chemical entities that were not effective in the preceding selection process.
  • The annealed fragments are eluted from the immobilized identifier nucleic acid molecules by re-suspending the beads in 25 µl of 60° C. H₂O and incubating for 2 min at 60° C.
  • The enriched fragments were purified on a micro-spin gel filtration column (Bio-Rad).
  • The eluted fragments were prepared for mass spectroscopy (MS) analysis by mixing in half a volume of ion exchanger resin and incubating for a minimum of 2 h at 25° C. on a shaker.
  • The mass of the correct complementary fragment (number 1) is obtained in the MS analysis (11438.39 Da, expected 11439 Da). No masses for the other fragments (numbers 2-4) could be found in the MS spectra (expected masses: 11415, 11430 and 11424 Da).
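  • The assignment of the observed mass to fragment number 1 can be reproduced with a small nearest-mass lookup. The expected masses are those quoted above; the 2 Da tolerance and the function name are assumptions made only for this sketch.

```python
# Expected masses (Da) of fragments 1-4 as quoted in the text.
EXPECTED_MASSES_DA = {1: 11439.0, 2: 11415.0, 3: 11430.0, 4: 11424.0}


def assign_fragment(observed_mass, expected=EXPECTED_MASSES_DA, tolerance_da=2.0):
    """Return the fragment number with the closest expected mass,
    or None if even the closest one is outside the (assumed) tolerance."""
    number, mass = min(expected.items(), key=lambda item: abs(item[1] - observed_mass))
    return number if abs(mass - observed_mass) <= tolerance_da else None


match = assign_fragment(11438.39)  # observed mass from the MS analysis
print(match)
```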
  • This result shows that the right fragment is strongly enriched and other fragments with the wrong codon sequences are removed. The enrichment is possible even when the “spacing” region (boldface) is identical in each fragment.
  • Two control experiments were also performed to validate the enrichment protocol. In the first experiment, the fragment with the correct codon sequence (number 1) was mixed with the immobilized identifier molecule as described above. The sample was washed and eluted, also as described above, and prepared for MS analysis. The result from the MS analysis is shown below.
  • The result indicates that the fragment with the correct sequence (number 1) anneals to the immobilized identifier molecules and is eluted under the conditions used in this example.
  • The expected mass (11439 Da) correlates well with the experimental mass, 11438.39 Da.
  • the enriched fragments obtained using this strategy may then be used to generate a new library of encoded molecules.
  • This new library will contain encoded molecules composed of the enriched chemical entities.
  • The library size has been reduced due to the removal of chemical entities not involved in binding encoded molecules, and the library is enriched in chemical entities that are highly represented in the encoded molecules which bind to the target molecule.
  • Example 1 shows the possibility of enriching for specific building block molecules, i.e. nucleic acid fragments associated with transferable chemical entities.
  • The same procedure can be used for a larger pool of building blocks than the four used herein.
  • the codon design will determine the maximum number of building blocks that can be used.
  • the sequence in the codon region should be large enough to allow discrimination in the annealing step.
  • Various conditions can be used to increase the stringency in the annealing step. Parameters such as temperature, salt, pH, formamide concentration, time and other conditions could be used.
  • This example describes the enrichment of building blocks using an identifier nucleic acid (identifier) molecule with multiple codons. These codons encode a displayed molecule (DM) that is attached to the identifier molecule before the selection is performed.
  • the library size is determined both by the number of different chemical entities and the total number of chemical entities.
  • The identifier molecule shown below contains three codons. The codons, which code for the displayed molecule, are underlined, and the regions separating the codons (framing regions) are shown in boldface.
  • The size of the codons can be varied depending on the diversity needed in the library and the optimal setup for chemical entity enrichment.
  • the framing region can also be varied dependent on the discrimination needed to distinguish the precise position of a codon in the identifier molecule.
  • The framing region will also be important for the generation of the library. This can be understood when the encoding is accomplished by extension of the encoding region as disclosed in DK PA 2002 01955 and U.S. No. 60/434,425, incorporated herein by reference. There needs to be a perfect match at the 3′-end in order to get efficient extension with a polymerase or a ligase.
  • the size of this spacing/framing region should be long enough to form a complementing region to allow extension with a polymerase or ligase.
  • the spacing region should be between 3 and 6 nucleotides.
  • the codon region together with the spacing region will also be useful when codons are to be identified using a micro array setup. The identifier molecule with the right codon sequences will hybridize to the array and be detected.
  • This identifier molecule is amplified with two primers (below) using a standard PCR reaction, for example 500 nM of each primer, 2.5 units Taq polymerase and 0.2 mM of each dNTP in a PCR buffer (50 mM KCl, 10 mM Tris-Cl, 3 mM DTT, 1.5 mM MgCl₂, 0.1 mg/ml BSA), run for 25 cycles (94° C. melt for 30 seconds, 55° C. anneal for 45 seconds, 72° C. extension for 60 seconds). B-GCACACTAGCTTGAGCACACTGACA-3′ CGAAATGCTAGGGCGTCCATTGGCA-5′
  • This amplified product is then immobilized on a solid support, for example streptavidin beads. This can be performed as described in Example 1.
  • The complementing non-sense strand is removed by incubating in 10 mM NaOH for about 2 min, followed by washing with 100 mM MES buffer, pH 6.0. This procedure will generate the strand shown below, in which the codon regions are exposed to allow hybridization with the complementing sequences.
  • The next step is to protect the complementing sequences outside the codons to prevent the binding of the building blocks to these sequences. This can be performed by adding “blocking” oligonucleotides that have a complementing sequence. This is shown below.
  • The pool of different building blocks is added and allowed to anneal to the codon region in the identifier region.
  • the position of annealing is determined by the spacing region shown in boldface.
  • The stringency is adjusted to allow hybridization only of the correct building block in the right position. This can be accomplished by mixing the right components together using various conditions.
  • the condition can for example include the presence of salt, formamide and various buffers adjusted to suitable pH and temperature.
  • Below is the correct building block that will anneal to the enriched identifier molecules. These building blocks are annealed and eluted as described in Example 1.
  • next pool of building blocks is blocked with an oligonucleotide that also protects the first codon. This is necessary to prevent binding of the building blocks in that codon.
  • The identifier molecule is protected with a blocking oligo that exposes only the last codon.
  • The enrichment of each library of building blocks is performed in separate tubes in order to keep the libraries of building blocks separated.
  • the enrichment is performed with building blocks loaded with chemical entities (CE).
  • The graph below illustrates the relationship between the number of chemical entities and the library size.
  • The example below is calculated on the basis that the final encoded molecules contain four chemical entities, each individually encoded by the corresponding building block (n⁴, where n is the number of building blocks).
  • The graph shows that the diversity decreases dramatically as the total number of building blocks is reduced. If the number of different building blocks can be reduced to about 20-30 (library sizes of 16*10⁴ and 81*10⁴, respectively) in the selection process, then the library size for the final round of selection is low enough for identification of the binding molecules.
  • This example illustrates one possibility to perform codon analysis on a whole population of different identifier oligonucleotides.
  • The analysis can also be performed using an array where the probe oligonucleotides (complementary to the codons) are immobilized in discrete areas and the signal is monitored depending on the amount of identifier oligonucleotides hybridised in each specific area. Codon analysis can also be performed by standard sequencing using a polymerase extension step.
  • A universal Taqman probe anneals to a region adjacent to the codon region, but within the amplicon defined by the universal PCR primers Pr. 1 and Pr. 2. These primers could be the same as those used for amplification of the identifier oligonucleotides encoding binders after an enrichment process on a specific target. However, if minimal-length identifiers are preferred during the encoding process, the region involved in Taqman probe annealing could be appended to the library identifier oligonucleotides by e.g.
  • The added length corresponding to the region necessary for annealing of the Taqman probe would be from 20 to 40 nt, depending on the type of Taqman probe and the T_A of the PCR primers.
  • the Q-PCR reactions are preferably performed in a 96 or 384-well format on a real-time PCR thermocycling machine.
  • FIG. 5 panel A, shows the detection of abundance of a specific codon sequence in position one. Similar primers are prepared for all codon sequences. For each codon sequence utilized to encode a specific BB in the library a Q-PCR reaction is performed with a primer oligonucleotide complementary to the codon sequence in question. A downstream universal reverse primer Pr. 2 is provided after the Taqman probe to provide for an exponential amplification of the PCR amplicon. The setup is most suited for cases where the codon constitutes a length corresponding to a length suitable for a PCR primer.
  • FIG. 5 panel B shows the detection of abundance of a specific codon sequence in a specific codon position using a primer which is complementing a codon and a framing sequence. Similar primers are used for all the codons and framing sequences.
  • a Q-PCR reaction is performed with an oligo complementary to the codon sequence in question as well as a short region up- or downstream of the codon region which ensures extension of the primer in a PCR reaction only when annealed to the codon sequence in that specific codon position.
  • the number of specific primers and Q-PCR reactions needed to cover all codon sequences in all possible codon positions equals the number of codon sequences times the number of codon positions.
  • Monitoring the abundance of 96 different codon sequences in 4 different positions can be performed in a single run on four 96-well microtitre plates (as shown in FIG. 5, panel B) or a single 384-well plate on a suitable instrument.
  • This architecture allows for the decoding of an 8.5 × 10⁷ library of different encoded molecules.
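  • The reaction count, plate count and library size for this architecture follow directly from the numbers given. A short Python sketch, assuming one reaction per well and no wells reserved for controls or replicates (the function name is illustrative):

```python
import math


def qpcr_plan(n_codons, n_positions, wells_per_plate):
    """Return (position-specific reactions, plates needed, decodable library size)."""
    reactions = n_codons * n_positions          # one primer per codon per position
    plates = math.ceil(reactions / wells_per_plate)
    library_size = n_codons ** n_positions      # all codon combinations
    return reactions, plates, library_size


plan_96 = qpcr_plan(96, 4, 96)
plan_384 = qpcr_plan(96, 4, 384)
print(plan_96, plan_384)
```

  • With 96 codons in 4 positions this gives 384 reactions and 96⁴ = 84,934,656 encoded molecules, i.e. the 8.5 × 10⁷ quoted in the text.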
  • Quantification is performed relative to the amount of full-length PCR product obtained in a parallel control reaction on the same input material performed with the two external PCR primers Pr.1+Pr. 2. Theoretically, a similar rate of accumulation of this control amplicon compared to the accumulation of a product utilizing a single codon+sequence specific primer would indicate a 100% dominance of this particular sequence in the position in question.
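  • One common way to turn this comparison into a number is the difference in threshold cycles between the codon-specific reaction and the full-length control. The sketch below is an assumption rather than a procedure stated in the text: it presumes both reactions amplify with the same efficiency (perfect doubling by default), and the C_T values are hypothetical.

```python
def codon_fraction(ct_codon, ct_control, efficiency=1.0):
    """Estimate the fraction of identifiers carrying a codon at a position.

    efficiency=1.0 means the product doubles every cycle (an assumption);
    equal C_T values then correspond to 100% dominance of the codon.
    """
    return (1.0 + efficiency) ** (ct_control - ct_codon)


dominant = codon_fraction(ct_codon=20.0, ct_control=20.0)  # hypothetical C_T values
minor = codon_fraction(ct_codon=22.0, ct_control=20.0)     # two cycles later
print(dominant, minor)
```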
  • Panels A and B employ a Taqman probe strategy, but other detection systems (SYBR Green, Molecular Beacons, etc.) could be used instead.
  • Multiplex reactions employing up to 4 different fluorophores in the same reaction could increase throughput correspondingly.
  • The 160 bp products were gel-purified using the QIAquick Gel Extraction Kit from QIAGEN (Cat. No. 28706) and quantified on a spectrophotometer.
  • 20 ng of each of the identifiers (as estimated from these measurements) were loaded on an agarose gel.
  • Sample B: 20 ng/20 µl stocks of each identifier were prepared. The sample was mixed as follows:
  • Standard curve: The samples for the standard curve were prepared by diluting Sample A 116.55-fold to 10⁹ copies/5 µl (0.33 fmol/µl) and subsequently performing a 10-fold serial dilution of this sample. 5 µl was used for each PCR reaction. The standard curve is shown in FIG. 2.
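  • The stated concentration of the top standard can be checked by converting copy number to femtomoles with Avogadro's number. In the sketch below the nine dilution steps are an assumption, chosen to span the 10 to 10⁹ copy range reported for the standard curve.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_to_fmol(copies):
    """Convert a number of molecules to femtomoles."""
    return copies / AVOGADRO * 1e15


def serial_dilution(start_copies, factor, steps):
    """Copy numbers per reaction along a serial dilution, starting undiluted."""
    return [start_copies / factor ** i for i in range(steps)]


top_fmol_per_ul = copies_to_fmol(1e9) / 5.0  # 10^9 copies in 5 µl
standards = serial_dilution(1e9, 10, 9)      # assumed 9 points: 10^9 down to 10 copies
print(round(top_fmol_per_ul, 2), standards[-1])
```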
  • P1-4 CATACTGTGTACGTCAACACGTCAGATA 67.4° C.
  • P1-5 CATACTGTGGAACTACCATCCAAGGATA 68.0° C.
  • P1-6 CCATCCAACATCGTTGGAAGAT 67.8° C.
  • P1-7 CATACAACCTGTCCTGTGAGATCTGATA 67.7° C.
  • P1-8 ATACTCACGAAGCTGGATGATGAGATA 67.3° C.
  • P1-9 CATACTAGCATCGATCGAACGTAGGATA 68.1° C.
  • P1-10 TCATACTCGAAGCTACTGTCGAGATGATA 68.2° C.
  • P2-1 ATATTAGTGTGTGACGATGGTACGCA 67.8° C.
  • P3-1 ACAAGTACGAACGTGCATCAGAGA 67.7° C.
  • P4-1 CGAGCAGGACCTGGAACCT 67.7° C.
  • P4-2 TCGACCACTGCAGGTGGA 68.3° C.
  • P4-3 GCTTCCTCTGCTGCACCA 66.7° C.
  • P4-4 GGTGTCGAGGTGAGCAGCA 69.1° C.
  • P4-5 CGACGAGGTCCATCCTGGT 68.6° C.
  • P4-6 GTGAGGAGCAGGTCCTCCTGT 68.0° C.
  • P4-7 CTGACACTGGTCGTGGTCGA 68.8° C.
  • P4-8 CATCTCGACGACCTGCTCCT 67.9° C.
  • P4-9 ACGAGGTCTCCACTGGTCCA 68.3° C.
  • P4-10 ACTGAGCTGCTCCTCCAGGT 66.5° C.
  • FIG. 6 shows the standard curve calculated by the 7900HT system software.
  • The log of the starting copy number was plotted against the measured C_T value.
  • The relationship between C_T and starting copy number was linear in the range from 10 to 10⁹ identifier copies.
  • the results of the experiments show the possibility of accurately quantification of identifier oligonucleotides down to or even below 10 copies with a 9 fold dynamic range, and reliable relative quantification of the tested codons in various positions in the identifier oligonucleotide.
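  • A linear standard curve of this kind can be fitted and then inverted to estimate an unknown copy number. The sketch below uses ordinary least squares in plain Python; the C_T values are synthetic (generated from an assumed slope of -3.32 and intercept of 40) and are not the measured data, and the instrument software may use a different fitting procedure.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of C_T = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic standards from 10 to 10^9 copies (assumed slope and intercept).
standard_copies = [10 ** k for k in range(1, 10)]
standard_ct = [40.0 - 3.32 * k for k in range(1, 10)]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
print(slope, intercept)
print(copies_from_ct(25.0, slope, intercept))
```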
  • Another possibility to analyse codons in identifier oligonucleotides is to use array format with attached probe oligonucleotides.
  • Adaptor oligonucleotides:
    3′ CTCATCGGAAGGGCTCGTAA CGG TGGGTTTGGG GGC TGGGTTTGGG G CG TGGGTTTGGG CGG-5′
    3′ TTTGGTAGCTGAGTGCCCTAGGC TGGGTTTGGG CGG TGGGTTTGGG G GC TGGGTTTGGG GCG-5′
    3′ TAACTGGTTTGACGCCACGCGCG TGGGTTTGGG GCG TGGGTTTGGG C GG TGGGTTTGGG GCG-5′
    3′ TAATTGAGCTGACGGCGCACGGC TGGGTTTGGG CG TGGGTTTGGG GC TGGGTTTGGG GCG-5′
    3′ TGTTGCTACTCTGGCCCGAGGC TGGGTTTGGG C TGGGTTTGGG C TGG GTTTGGG GCG-5′
    3′ ACGGGATAACAACGCAGCCTGGC TGGGTTTGGGTGGGTTTGGGTGGG GCG-5′
  • the Adaptor mix (100 pM final concentration for each of the adaptor oligonucleotides) in a hybridization buffer (100 mM MES, 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, 1× Denhardt's), was heated to 95° C. for 5 min and subsequently cooled and maintained at 40° C. for 5 min before loading onto the Affymetrix GenFlex probe array cartridge. The probe array was then incubated for 2 h at 45° C. at constant rotation (60 rpm).
  • the remaining Adaptor mix was removed from the GenFlex cartridge, and replaced with the identifier in a hybridization buffer (100 mM MES, 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, 1× Denhardt's).
  • the identifier hybridisation mix was heated to 95° C. for 5 min and subsequently cooled and maintained at 40° C. for 5 min before loading onto the Affymetrix GenFlex probe array cartridge and hybridised for 2 h at 45° C. at constant rotation (60 rpm).
  • the washing and staining procedure was performed in the Affymetrix Fluidics Station.
  • the probe array was exposed to 2 washes in 6× SSPE-T at 25° C.
  • biotinylated Identifier oligonucleotide was stained with a streptavidin-phycoerythrin conjugate, final concentration 2 µg/µl (Molecular Probes, Eugene, Oreg.) in 6× SSPE-T for 10 min at 25° C. followed by 6 washes in 6× SSPE-T at 25° C.
  • the Array analysis shows that the codons including the framing regions are able to distinguish between the different probe oligonucleotides.
  • the designed probes will only detect codons with the correct framing region, which makes it possible to identify first the codon itself and secondly the position in which the codon is located. Even a single deletion in both framing regions significantly reduces the hybridization of the identifier.
  • the framing sequence may be used to obtain information about the position of a specific codon and the point in the reaction history when a given reaction of a chemical entity has occurred.
  • the information obtained in this example using either QPCR or array codon analysis as example can be used to generate a new more focused library.
  • the signal from the QPCR analysis or the array analysis can directly be used to combine preferable chemical entities.
  • sequence information obtained from a codon analysis performed according to the principles described in Examples 4 or 5 can be utilized to assemble a new, more focused library. Sequence information can also be used to design a second-generation library with reduced diversity. This example illustrates how sequence data can be utilized to make a more focused library with the enriched chemical entities. An identical strategy can be based on the codon analysis methods described in Examples 4 or 5.
  • a 700-member library was generated, composed of 4 × 25 × 7 chemical entities.
  • the library generation protocol is described below with the sequence information and chemical entity structure.
  • each oligo (Ax, Bx, Cx) was used and can be designed by using a specific nucleotide sequence for each chemical entity.
  • two complementary oligonucleotides (e.g. oligo Ax and oligo ax) containing a particular codon are allowed to hybridize before the ligation step.
  • the ligation of each codon oligonucleotide in each position is coupled with the attachment of the encoded chemical entity.
  • Pnt corresponds to pentenoyl—an amine protecting group.
  • R can be any molecule fragment.
  • the chemicals used in library generation comprise a primary (shown) or a secondary amine.
  • First oligonucleotides of the A series are each modified by adding to each type of oligo a small molecule building block (BBAX) to the 5′ amine forming an amide bond. After this step the identifier is comprised of oligo Ax.
  • nmol of a mixture of different modified A oligos are then split into a number of tubes corresponding to the number of different building blocks to be used in round B.
  • 190 pmol Oligo a and 2 µl herring DNA are added to each tube and the DNA material in each tube is lyophilized.
  • the lyophilized DNA is then redissolved in 50 µl water and purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • each tube is again lyophilized and redissolved in 2 µl 100 mM Na-borate pH 8.0/100 mM sulfo N-hydroxy succinimide (sNHS).
  • 10 µl building block BBBX (100 mM in dimethyl sulfoxide [DMSO]) is preactivated by mixing with 10 µl 1-Ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC) (90 mM in dimethylformamide [DMF]) and incubating at 30° C. for 30 min.
  • 3 µl of this preactivated mixture is then mixed with the 2 µl in each tube and allowed to react for 45 min at 30° C.
  • an additional 3 µl freshly preactivated BB is added and the reaction is allowed to proceed for 45 min at 30° C.
  • the resulting mixture is then purified by spinning through Bio-Rad P6 DG (Desalting gel).
  • the DNA material is then lyophilized and redissolved in 10 µl water containing 200 pmol oligo Bx (e.g. B1) and the corresponding oligo bx (e.g. b1). This is done so that the codon in oligo Bx identifies the BBBX added to the DNA identifier.
  • 10 units of T4 DNA ligase (Promega) and 1.2 µl T4 DNA ligase buffer are then added to each tube and the mixture is incubated at 20° C. for 1 hour.
  • the DNA identifier linked to the small molecules now comprises an Ax oligo with a Bx oligo ligated to its 3′ end.
  • the reactions are then pooled, an appropriate volume of water is allowed to evaporate and the remaining sample is purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • the pooled sample (~50 µl) is adjusted to 10 mM Na-acetate (pH 5). 0.25 volumes of 25 mM Iodine in tetrahydrofuran/water (1:1) is added and the sample is incubated at 37° C. for 2 h. The reaction is then quenched by addition of 2 µl of 1 M Na2S2O3 and incubation at room temperature for 5 min. The complexes are then purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • the sample is adjusted to 50 µl 100 mM sodium borate pH 8.5, 20 µl of 500 mM 4-methoxy thiophenol (in acetonitrile) is added and the reaction is incubated at 25° C. overnight. Then the complexes are purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water and then lyophilized.
  • the samples are dissolved in 175 µl 100 mM Na-borate pH 8.0 and distributed into 25 wells (7 µl/well). 2 µl 100 mM BBCX in water/DMSO and 1 µl of 250 mM DMT-MM are added to each reaction and incubated at 30° C. overnight. Water is added to 50 µl and the reactions are then spin purified using Bio-Rad P6 DG (Desalting gel) and subsequently water is allowed to evaporate so that the final volume is 10 µl.
  • the DNA material is then lyophilized and redissolved in 10 µl water containing 200 pmol oligo Cx (e.g. C1) and the corresponding oligo cx (e.g. c1). This is done so that the codon in oligo Cx corresponds to the BBCX added to the DNA identifier.
  • 10 units of T4 DNA ligase (Promega) and 1.2 µl T4 DNA ligase buffer are then added to each tube and the mixture is incubated at 20° C. for 1 hour.
  • the DNA identifier linked to the small molecules now comprises an Ax oligo with a Bx oligo ligated to its 3′ end and a Cx oligo ligated to the 3′ end of the Bx oligo.
  • the reactions are then pooled, the pooled sample volume is reduced by evaporation and the sample is purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • the pooled sample (~50 µl) is adjusted to 10 mM Na-acetate (pH 5). 0.25 volumes of 25 mM Iodine in tetrahydrofuran/water (1:1) is added and the sample is incubated at 37° C. for 2 h.
  • the reaction is then quenched by addition of 2 µl of 1 M Na2S2O3 and incubation at RT for 5 min.
  • the DNA identifiers (carrying small molecules) are purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water and then lyophilized.
  • Some building blocks contain methyl esters that are deprotected to acids by dissolving the pooled sample in 5 µl 20 mM NaOH, heating to 80° C. for 10 minutes and adding 5 µl of 20 mM HCl.
  • oligo d is extended along the identifier by adding to the sample 10 µl of 5× Sequenase EX-buffer [100 mM Hepes, pH 7.5, 50 mM MgCl2, 750 mM NaCl] and 4000 pmol oligo d. Annealing is performed by heating to 80° C. and cooling to 20° C. To the sample is then added 500 µL dNTP, water to 50 µl and 39 units of Sequenase version 2.0 (USB). The reaction is incubated at 37° C. for 1 hour.
  • This library is subjected to selection, whereby binders to the selection target are enriched.
  • the wells were washed 10 times with blocking buffer and the encoded library was added to the wells after diluting it 100 times with blocking buffer. Following 2 hours incubation at room temperature the wells were washed 10 times with blocking buffer. After the final wash the wells were cleared of wash buffer and subsequently inverted and exposed to UV light at 300-350 nm for 30 seconds using a trans-illuminator set at 70% power. Then 100 µl blocking buffer without Tween-20 was immediately added to each well, the wells were shaken for 30 seconds, and the solutions containing eluted identifiers were removed for PCR amplification.
  • a TOPO-TA (Invitrogen) ligation reaction is assembled with 4 µl PCR product, 1 µl salt solution (Invitrogen) and 1 µl vector. Water is added to 6 µl. The reaction is then incubated at RT for 30 min. Heat-shock competent TOP10 E. coli cells are then thawed on ice and 5 µl of the ligation reaction is added to the thawed cells. The cells are then incubated 30 min on ice, heat-shocked in 42° C. water for 30 sec, and then put on ice again. 250 µl of growth medium is added to the cells and they are incubated 1 h at 37° C. The medium containing cells is then spread on a growth plate containing 100 µg/ml ampicillin and incubated at 37° C. for 16 hours.
  • E. coli clones are then picked and transferred to PCR wells containing 50 µl water. These 50 µl were incubated at 94° C. for 5 minutes and 20 µl was used in a 25 µl PCR reaction with 5 pmol of each TOPO primer (M13 forward and M13 reverse) and Ready-To-Go PCR beads (Amersham Biosciences). The following PCR profile is used: 94° C. 2 min, then 30 × (94° C. 4 sec, 50° C. 30 sec, 72° C. 1 min), then 72° C. 10 min.
  • the codons in the identifier oligonucleotides were analysed. Before the analysis, the identifier oligonucleotides were amplified using the constant flanking regions and the amplified material was used in the identifier sequence analysis.
  • a sequence codon analysis of the selected codons showed a bias for specific chemical entities. They are listed in the table below. For instance, in position 1 chemical entity 98 was seen 47 times (out of 51 sequences, 92%, compared to 25% before the selection), chemical entity 99 was seen 14 times (out of 51 sequences, 27%, compared to 4% before selection) and chemical entity 53 was seen 35 times (out of 51 sequences, 68%, compared to 14% before selection).
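  • The enrichment behind these figures can be reproduced with a small calculation. The sketch below assumes that the pre-selection frequencies of 25%, 4% and 14% reflect uniform representation across the 4, 25 and 7 chemical entities of the 4 × 25 × 7 library; that mapping is an inference from the numbers, and the function name is hypothetical.

```python
def codon_enrichment(count, total, entities_at_position):
    """Return (observed frequency, fold enrichment over a uniform starting library)."""
    observed = count / total
    expected = 1.0 / entities_at_position  # uniform representation (assumption)
    return observed, observed / expected


TOTAL_SEQUENCES = 51
# (chemical entity, times seen, assumed number of entities at its position)
observations = [(98, 47, 4), (99, 14, 25), (53, 35, 7)]

for entity, count, n_entities in observations:
    observed, fold = codon_enrichment(count, TOTAL_SEQUENCES, n_entities)
    print(f"entity {entity}: {observed:.1%} observed, {fold:.1f}-fold enriched")
```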
  • the new focused library with the selected chemical entities can be selected against the target and the outcome from the selection can be analysed.
  • the most abundant binders will be the combination between the chemical entities 98-99-53 and the second most abundant binder is 98-158-53 as shown below.
  • This example illustrates the possibility of reducing the library diversity by using the enriched chemical entities in a new library and performing another round of selection on the chosen chemical entities.
  • BB building blocks
  • oligonucleotide oligo
  • Some of the building blocks carried an amine functional group and a carboxylic acid functional group.
  • the building block amine was protected by N-pentenoylation and deprotected by iodine treatment prior to the reaction of the following building block.
  • Oligonucleotide 1 (Oligo1) carried an amine functional group to allow reaction with the building block 1's carboxylic acid and oligonucleotides are optionally derivatized by phosphorylation to allow ligation.
  • Oligonucleotide3 (oligo3) also comprised a primer region for PCR amplification.
  • EDC/NHS, EDC/sulfoNHS or DMTMM were used as coupling reagents.
  • Building block abundance analysis may be done by QPCR or by sequencing full sequences and then analyzing for the abundance of individual building blocks.
  • the overall process leads to molecules of the following structure, where the oligonucleotide was double stranded.
  • the oligonucleotide was made double stranded by the use of double stranded Oligos 1, 2 and 3 with an overhang to allow ligation of both strands.
  • the identified sequences were then analyzed for the abundances of building blocks at each position in the sequence.
  • the most abundant building blocks at each position from the two libraries 1 and 2 were then used again to generate a new and smaller library of 1,365 members, which was selected for binders of the Integrin αvβ3 receptor.
  • the library was generated with 7 different building blocks in position 1, 13 different building blocks in position 2 and 15 different building blocks in position 3.
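  • The library sizes quoted in the examples follow directly from the number of building blocks per position. A minimal sketch, in which the helper names are hypothetical and the 1-based indices are placeholders rather than the building block numbers used in the libraries:

```python
from itertools import product
from math import prod


def library_size(entities_per_position):
    """Number of distinct members when every combination of positions is made."""
    return prod(entities_per_position)


def enumerate_library(entities_per_position):
    """Yield each member as a tuple of 1-based placeholder indices, one per position."""
    ranges = [range(1, n + 1) for n in entities_per_position]
    return product(*ranges)


print(library_size([7, 13, 15]))   # focused library described above
print(library_size([4, 25, 7]))    # the 700-member library of the earlier example
print(next(iter(enumerate_library([7, 13, 15]))))
```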
  • each of the building block numbers identifies one specific building block or, in two instances (library 1), a mixture of three different building blocks.
  • the same numbers are used for each building block in all libraries, however the oligonucleotide used to identify each building block may not necessarily be the same between libraries to avoid potential problems of cross contamination.

Abstract

The present invention relates to a method for generating a second-generation library. In a first step, a library of encoded molecules associated with an identifier nucleic acid comprising codons identifying chemical entities that have participated in the formation of the encoded molecule is provided. In a second step, the library is partitioned and encoded molecules having a certain property are selected. Codons of identifiers of selected encoded molecules are subsequently identified, and a second-generation library is prepared using at least some of the chemical entities coded for by the identified codons. The new focused library may be used in another partition step to select encoded molecules with a certain property.

Description

  • Various patent and non-patent references cited in the present application are hereby incorporated by reference in their entirety.
  • TECHNICAL FIELD OF THE INVENTION
  • The invention relates to a method for producing a second-generation compound library with an improved desired property profile. In nature and artificial methods based on the natural system, the parent genotype is carried on to the off-spring and results in a phenotype in which the exact type and sequence of amino acids is retained, unless a mutation and/or recombination has occurred. The present method only retains the identity of chemical entities, e.g. amino acids, while the sequence wholly or partly is scrambled. The result is a focused second-generation library with lower diversity.
  • BACKGROUND OF THE INVENTION
  • Biological evolution is based on the survival of specific genotypes that encode phenotypes with the most suitable functionalities in a certain environment. In all living species DNA programs the genotype. DNA serves two important functions in the natural selection process. One function is obviously to encode for the type of nucleotides used and the other function is to encode for the specific order of nucleotide sequences in a nucleic acid sequence. The strategy used in nature, i.e. encoding for the exact type as well as the precise sequence of nucleotides, ensures an extreme similarity between the progeny and its parents. Thus, conserving almost the exact sequence and type of the nucleotides is absolutely essential in order to create offspring with a high functionality. The changes in the genotype from one generation to another, which allow for evolution, are determined by the random mutation rate and recombination between the two parents' genotypes.
  • Natural selection cannot afford too many changes in the DNA from one generation to the next if the survival of the species is to be secured. Therefore, nature has evolved sophisticated means to proofread the copying of the DNA from the parents to their progeny and has ensured that the characteristics of the phenotype are carried from one generation to the next only by the DNA.
  • Within the art of selecting ligands from a library of encoded polypeptides associated with a corresponding identifier nucleic acid sequence, the method of nature is used. Thus, when more than a single library generation is needed, the identifier nucleic acid sequences (genotype) carry the information from one generation to the next.
  • WO 93/03172 A1 discloses a method for identifying a polypeptide ligand having a desired property in a polypeptide library. In a first step, a translatable mRNA mixture is provided, which is mixed with a mixture of ribosome complexes to form a translation product attached to the mRNA strand responsible for the formation thereof. In a second step the ribosome complexes binding to a target are partitioned from the remainder of the library. In a third step, the mRNA strands of the partitioned ribosome complexes that have bound to the target are amplified. The amplified mRNA strands are used for the production of a second generation library, which is subjected to a renewed contact with the target. The method is repeated a sufficient number of times until the size of the library has narrowed to a small pool of high affinity binders.
  • In WO98/31700 A1 a method for selecting a DNA molecule, which encodes for a desired protein, is disclosed. The method implies the initial presence of a pool of candidate RNA molecules, which is subsequently translated into a corresponding pool of RNA-protein fusions. Subsequently the mRNA-protein fusion products are subjected to a selection process, i.e. the fusion products are presented to a target molecule, and a new pool of complexes capable of binding to the target is partitioned. From the new pool of complexes, the mRNAs are recovered and amplified for use in a subsequent round of library generation. Xu, L. et al., Chemistry & Biology, Vol. 9, 933-942, August 2002 discloses a practical embodiment in which a library of more than 10^12 unique mRNA-protein fusion products is used, through ten rounds of library generation and selection, to identify a high affinity binding protein.
  • The preparation of libraries of synthetic molecules associated with a corresponding identifier nucleic acid sequence, and the selection of synthetic molecules from such libraries, have been the subject of various patent applications. When two or more generations of libraries are needed, the identifier nucleic acid sequence is used as carrier between an initial library and the next generation library.
  • Thus, in WO 00/23458A1 libraries of complexes comprising non-natural molecules attached to corresponding nucleic acid sequences are suggested. After a selection of the library has been conducted, the nucleic acid sequences of successful complexes are amplified by PCR and a new library is prepared from these nucleic acid sequences. The same method of carrying information from an initial library to the next library is applied in WO 02/074929A2 and WO 02/103008A2.
  • The present invention provides a new method for evolving encoded molecules. The method is based on the identification of chemical entities used in the synthesis of reaction products of successful complexes and the application, at least in part, of these chemical entities in the preparation of the next generation library. The utilization of preferable chemical entities and the exclusion of certain undesired chemical entities in the next library generation generally imply that the next generation library has a smaller size compared to the size of the initial library, thereby, at the same time, retaining the desirable encoded molecules in the library.
  • SUMMARY OF THE INVENTION
  • The present invention concerns a method for producing a composition of molecules with an improved desired property, said method comprising the steps of: providing an initial library comprising a plurality of different encoded molecules associated with a corresponding identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction product of multiple chemical entities and the identifier nucleic acid sequence comprises codons identifying said chemical entities; subjecting the initial library to a condition partitioning members having encoded molecules displaying a predetermined property from the remainder of the initial library; identifying codons of the identifier nucleic acid sequences of the partitioned members of the initial library; and preparing a second-generation library of encoded molecules using the chemical entities coded for by the codons of the partitioned members of the initial library or a part thereof.
  • The present invention relates to a novel approach to perform evolution of molecules with a desired property, said approach being different from the approach of nature and the prior art. The invention is based on the selection of chemical entities, the counterpart of amino acids in Nature, instead of the precise sequence of chemical entities. This new approach is powerful in ex vivo conditions when high functionality of the offspring is not vital for success and when the number of chemical entities relative to the number of reactants used in each encoded molecule is high.
  • The method disclosed herein will be increasingly effective as the library size increases. This is because more chemical entities are used when the library size is increased while the number of reactions for the formation of the encoded molecule is fixed, and because different chemical entities tend to be involved in encoded molecules having the desired property. The chemical entities, which are part of the final selected molecules, will be enriched in each round of selection. Finally, when the diversity has been extensively reduced, the enriched molecules are decoded from the identifier nucleic acid sequence comprising the codons of the chemical entities that have participated in the formation of the encoded molecule.
  • The strategy of enriching chemical entities instead of specific combinations of chemical entities more efficiently searches the chemical space for all combinations of chemical entities that are likely to show a certain property, such as a binding ability towards a target. Thus, chemical entities having a certain impact on the formation of encoded molecules are allowed to recombine in each new library generation. In a certain aspect of the invention, the recombination is random, i.e. once a chemical entity has qualified as being of interest it is allowed in every position of the reaction sequence. In another aspect of the invention, the recombination is semi-random, i.e. once a chemical entity is qualified as being of interest it is used in a certain position in the reaction sequence of the encoded molecule. In still a further aspect of the invention, the amount of the chemical entity used in a subsequent library generation is dependent on the frequency and the amount of the partitioned library members.
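  • The difference between random and semi-random recombination can be illustrated by enumerating the resulting second-generation libraries. This is a sketch under stated assumptions: the entity labels and function names are hypothetical, and real libraries would also be constrained by which chemical entities can react in which position.

```python
from itertools import product


def semi_random_library(selected_by_position):
    """Each qualified entity is reused only in the position where it was found."""
    return list(product(*selected_by_position))


def random_library(selected_by_position):
    """Each qualified entity is allowed in every position of the reaction sequence."""
    pool = sorted({entity for position in selected_by_position for entity in position})
    return list(product(pool, repeat=len(selected_by_position)))


# Hypothetical entities qualified after a first selection, grouped by position.
selected = [["A1", "A2"], ["B1"], ["C1", "C2", "C3"]]

print(len(semi_random_library(selected)))  # product of the per-position counts
print(len(random_library(selected)))       # pooled entities raised to the number of positions
```

  • With the same set of qualified entities, random recombination gives a larger second-generation library than semi-random recombination, since every entity is tried in every position.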
  • The present invention may be of special interest when a group of chemical entities are selected from a larger pool of chemical entities in the formation of a first library. Selecting chemical entities resulting in encoded molecules having a certain property in a first library and spiking with remaining chemical entities of the pool allows for the formation of a second-generation library not necessarily of a smaller size but enriched in encoded molecules having a certain property.
  • The second-generation library may be formed of a reaction product of the chemical entities without attaching the reaction product to a nucleic acid. In an embodiment of such second-generation library the individual reaction products are formed in discrete reaction compartments in accordance with traditional combi-chem technology. In a certain aspect of the invention, the second-generation library is prepared as the first generation library, i.e. the second-generation library comprises a plurality of different encoded molecules associated with a corresponding identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction product of multiple chemical entities and the identifier nucleic acid sequence comprises codons identifying said chemical entities.
  • In a preferred aspect, the invention comprises subjecting the second-generation library to a condition partitioning members having encoded molecules displaying a predetermined property from the remainder of the second-generation library. The second-generation library may be partitioned as to the same property or a different property. Notably, the second-generation library can be screened against the same target or a different target.
  • After the partitioning of the second-generation library, the invention comprises the step of deducing the identity of the encoded molecule(s) using the identifier nucleic acid sequence, when present. Optionally, a third or further generation library may be formed and screened before the final deducing step is performed. In a certain embodiment, the decoding includes decoding the codons of the identifier nucleic acid sequence to establish the synthesis history of the encoded molecules. The synthesis history includes the identity of the chemical entities used and the point in time they enter the sequence of reactions resulting in the encoded molecule.
  • The encoded molecule is preferably a reaction product in which multiple chemical entity precursors have participated. The encoded molecule may have any chemical structure. Generally, the multiple chemical entities are precursors for a structural unit appearing in the encoded molecule. However, the chemical entities may also perform a chemical reaction with the nascent encoded molecule, which results in the alteration or removal of chemical groups. In certain aspects of the invention, the encoded molecule is a scaffolded molecule, i.e. various chemical entities have reacted with a chemical core structure like steroid, benzodiazepine, retinol, camphor, ephedrine, penicillin, cannabinol, coumarin, oxazol, etc. In certain other aspects of the invention the encoded molecule is fully or partly a polymer. The polymer may be of a type which occurs naturally or may be a non-naturally occurring polymer. Nature only has the possibility of preparing α-polypeptides using the recognition of a codon of an mRNA strand by the anticodon of a charged tRNA. In some aspects of the invention, the encoded molecule is not an α-peptide. Notably, in some aspects of the invention, the chemical entities are reacted without enzymatic interaction to produce the encoded molecule.
  • The encoded molecule can be associated with the nucleic acid sequence identifier in any appropriate way. In a certain aspect of the invention, the encoded molecule associated with the corresponding identifier nucleic acid sequence is a bifunctional complex. The bifunctional complex may be formed by covalent or non-covalent attachment of the encoded molecule to the identifier nucleic acid sequence. In another aspect of the invention, an identifier nucleic acid sequence is physically a distinct entity separated from the encoded molecule, wherein the identifier identifies the spatial position of an encoded molecule, e.g. in the same compartment in which an encoded molecule is formed a corresponding identifier oligonucleotide is generated.
  • The conditions partitioning complexes of interest from the remainder of the library may be chosen from a variety of possibilities. In one aspect the condition relates to physical parameters, so that complexes displaying a physical stability under e.g. certain temperature conditions, certain acidic conditions, certain radiation conditions etc. are selected from the library. In other aspects of the invention the condition for partitioning the desired complexes includes subjecting the initial library to a molecular target and partitioning complexes binding to this target. The molecular target may be any compound of interest. Exemplary targets are proteins, carbohydrates, polysaccharides, hormones, receptors, antibodies, viruses, antigens, cells, tissues etc. In certain aspects the target is immobilized on a solid support, such as column material, and contacted with the candidate complexes in a fluid medium, followed by a partitioning of the complexes capable of binding to the target under the contacting conditions used. Typically the binding complexes are eluted from the column using increased stringency conditions.
  • The complexes as such or only the identifier part are harvested after the partitioning step. Usually the identifier nucleic acid sequences are amplified prior to the identification step. The amplification is suitably performed by polymerase chain reaction (PCR). The amplified identifiers may be explicitly or implicitly identified. When the codons are identified explicitly, the sequence and identity of nucleotides in the codon is made known to the experimenter, whereas, when the codons of the identifiers are implicitly identified, the experimenter is not presented with the information.
  • Any suitable method for identifying codons may be used. In a certain aspect of the invention, traditional sequencing, e.g. by using a modification of the Sanger method or pyrosequencing methods, identifies the codons. In another aspect of the invention, the codons of the identifier nucleic acid sequences of the partitioned members of the initial library are identified by contacting said identifier nucleic acid sequences with a pool of nucleic acid fragments under conditions allowing for hybridisation.
  • The pool of nucleic acid fragments may be immobilized or in solution. In a certain aspect of the invention, the pool of nucleic acid fragments comprises a plurality of single stranded nucleic acid probes immobilized in discrete areas of a solid support, wherein the nucleic acid probes are capable of hybridising to a codon of the identifier nucleic acid sequence comprising codons. The nucleic acid probes may be positioned on a microarray, such that the identity of the codons is revealed by observing the discrete areas of the support in which a hybridisation event has occurred.
  • The nucleic acid probe can be directly hybridised to the identifier or the nucleic acid probe of the array is hybridised to an identifier nucleic acid sequence through an adapter oligonucleotide having a sequence complementing the probe as well as one or more codons of the identifier nucleic acid sequence. The probe may identify a single codon of an identifier or a probe of the array is capable of hybridising to two codons of the identifier nucleic acid sequence or a sequence complementary to said sequence. The ability to hybridise two or more codons makes it possible to study the influences of neighbouring chemical entities on each other. In a certain aspect, a nucleic acid probe of the array is capable of hybridising to all codons of an identifier nucleic acid sequence. This latter option will fully decode the identity of the encoded molecule. Usually, however, full decoding is only possible for a relatively small library size, as it presupposes a nucleic acid probe for each member of the library.
  • When single codons are detected, useful information about a certain codon may be gathered by detecting the codon together with a framing sequence identifying the position in the reaction history of the chemical entity corresponding to said codon.
  • As an example, if a library of complexes is prepared from 100 chemical entities and three reactions, i.e. each identifier comprises 4 codons, the library size is 10⁸ (100⁴). For most practical uses 10⁸ is in excess of what can be detected on an array, especially if multiple determinations for each identifier are considered necessary to obtain a high accuracy. However, an array of just 100 probes complementary to the 100 codons will reveal important information prior to or subsequent to a selection. In the event a framing sequence is detected together with the codon, an array of 400 probes (100 codons in each of 4 positions) is needed.
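  • The arithmetic of this example can be checked with a short sketch. The function names are illustrative and do not come from the application, and the sketch assumes that every codon position draws from the same set of 100 codons.

```python
def library_size(n_entities: int, n_positions: int) -> int:
    """Distinct identifiers when each position holds one of n_entities codons."""
    return n_entities ** n_positions


def probes_needed(n_entities: int, n_positions: int, with_framing: bool) -> int:
    """Probes needed to detect single codons, optionally with a framing sequence.

    Assumption: with a framing sequence, each codon needs one probe per position.
    """
    return n_entities * n_positions if with_framing else n_entities


print(library_size(100, 4))          # 100000000
print(probes_needed(100, 4, False))  # 100
print(probes_needed(100, 4, True))   # 400
```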
  • A suitable method for identifying a hybridisation event is to use a label. Therefore, in a preferred embodiment, the existence of a hybridisation event is measured through labelling of the identifier nucleic acid sequence, or an amplification product thereof. When the label emits light, the hybridisation event is measured by the emission of light in a scanner. To reveal the relative abundance of each chemical entity in the library of encoded molecules, the relative intensity of light in each discrete spot is measured.
  • The measurement of a hybridisation event may be conducted by various methods known in the art. In the event the label emits light, the presence or absence of a hybridisation event may be measured in a scanner, e.g. a confocal scanner. The scanner may be connected to computer software, which is able to quantify the amount of light measured. The amount of light measured correlates with the amount of identifier annealed to the probes. Thus, it is possible to measure not only the presence or absence of one or more codons of an identifier; it is also possible to measure the relative amount of the codons in one or more identifiers.
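  • As a minimal sketch of how spot intensities might be turned into relative codon abundances, the following assumes that the signal is linear in the amount of annealed identifier and that a single uniform background value can be subtracted. Both are simplifying assumptions, and the function name and the numbers are hypothetical.

```python
def relative_abundance(intensities: dict[str, float],
                       background: float = 0.0) -> dict[str, float]:
    """Convert raw spot intensities into fractions of the total corrected signal."""
    # Subtract the assumed uniform background and clip negative values to zero.
    corrected = {codon: max(value - background, 0.0)
                 for codon, value in intensities.items()}
    total = sum(corrected.values())
    if total == 0:
        return {codon: 0.0 for codon in corrected}
    return {codon: value / total for codon, value in corrected.items()}


# Made-up scanner readings for three probe spots.
spots = {"codon_A": 1200.0, "codon_B": 300.0, "codon_C": 100.0}
fractions = relative_abundance(spots, background=100.0)
for codon, fraction in fractions.items():
    print(codon, round(fraction, 3))
```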
  • After the complexes have been partitioned and the specific codons have been identified on the microarray, the information can be used to design optimized libraries including chemical entities based on both the selection data and the chemical structure. The microarray analysis will first of all detect which chemical entities pass the partitioning step. Secondly, the relative intensity on the microarray will reflect the relative binding affinity of the chemical entities. Finally, the structures of the chemical entities are directly identified due to the position of the probes on the array. For instance, chemical entities that are strongly selected in a partitioning process but possess some unfavourable chemical structure can be excluded from the next generation of library. Similarly, chemical entities that are weakly selected in a partitioning process but possess some favourable chemical structure can be included in the next generation of library. Thus, the next generation library design can be based both on a rational choice of chemical entities with lead-like structures and on the selection pressure detected on the microarray.
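  • The design rule described above, combining the selection signal with a structural judgement, could be sketched as follows. The class, the field names, the detection cut-off and the example entities are all hypothetical; in practice the structural assessment would come from a chemist or a cheminformatics filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityResult:
    name: str
    signal: float     # relative intensity on the array after partitioning
    lead_like: bool   # structural judgement supplied from outside this sketch


def design_next_generation(results: list[EntityResult],
                           detection_limit: float = 0.01) -> list[str]:
    """Keep entities that were selected at all AND have a favourable structure.

    Strongly selected entities with an unfavourable structure are excluded;
    weakly selected entities with a favourable structure are retained.
    """
    return [r.name for r in results
            if r.signal >= detection_limit and r.lead_like]


results = [
    EntityResult("BB1", 0.40, True),    # strongly selected, favourable
    EntityResult("BB2", 0.35, False),   # strongly selected, unfavourable
    EntityResult("BB3", 0.03, True),    # weakly selected, favourable
    EntityResult("BB4", 0.001, True),   # not selected
]
print(design_next_generation(results))  # ['BB1', 'BB3']
```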
  • Another method of identifying codons uses primer oligonucleotides as the nucleic acid fragments. The identification involves subjecting the hybridisation complex between the primer oligonucleotides and the identifier nucleic acid sequences to a condition allowing for an extension reaction to occur when the primer is sufficiently complementary to a part of the identifier nucleic acid sequence, and evaluating, based on measurement of the extension reaction, the presence, absence, or relative abundance of one or more codons.
  • The extension reaction requires a primer, a polymerase as well as a collection of deoxyribonucleotide triphosphates (abbreviated dNTP's herein) to proceed. An extension product may be obtained in the event the primer is sufficiently complementary to an identifier oligonucleotide for a polymerase to recognise the double helix as a substrate. After binding of the polymerase to the double helix, the deoxyribonucleotide triphosphates (a blend of dATP, dCTP, dGTP, and dTTP) are incorporated into the extension product using the identifier oligonucleotide as template. The conditions allowing for the extension reaction to occur usually include a suitable buffer. The buffer may be any aqueous or organic solvent or mixture of solvents in which the polymerase has sufficient activity. To facilitate the extension process, the polymerase and the mixture of dNTP's are generally included in a buffer which is added to the identifier oligonucleotide and primer mixture. An exemplary kit comprising the polymerase and the dNTP's for performing the extension process comprises the following: 50 mM KCl; 10 mM Tris-HCl at pH 8.3; 1.5 mM MgCl₂; 0.001% (wt/vol) gelatin; 200 μM dATP; 200 μM dTTP; 200 μM dCTP; 200 μM dGTP; and 2.5 units Thermus aquaticus (Taq) DNA polymerase I (U.S. Pat. No. 4,889,818) per 100 microliters (μl) of buffer.
  • The primer may be selected to be complementary to one or more codons or parts of such codons. The length of the primers may be determined by the length of the codons; however, the primers usually are at least about 11 nucleotides in length, more preferably at least 15 nucleotides in length, to allow for an efficient extension by the polymerase. The presence or absence of one or more codons is indicated by the presence or absence of an extension product. The extension product may be measured by any suitable method, such as size fractionation on an agarose gel and staining with ethidium bromide.
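  • The relationship between a codon, a primer complementary to it, and the expected presence or absence of an extension product can be illustrated with a small sketch. It uses an exact-match model only, which is a simplifying assumption (real priming tolerates some mismatches and depends on melting temperature), and the sequences and helper names are made up for illustration.

```python
# Hypothetical helpers; exact complementarity is assumed to be required.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_PRIMER_LENGTH = 11  # the text suggests primers of at least about 11 nucleotides


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_for(codon_region: str) -> str:
    """Design a primer that anneals to the given codon region of an identifier."""
    if len(codon_region) < MIN_PRIMER_LENGTH:
        raise ValueError("target region is shorter than the minimum primer length")
    return reverse_complement(codon_region)


def extension_expected(identifier: str, primer: str) -> bool:
    """True if the identifier contains a site fully complementary to the primer."""
    return reverse_complement(primer) in identifier.upper()


codon = "ACGTTGCAAGGCTAC"  # made-up 15-nucleotide codon
with_codon = "GGATCC" + codon + "TTGACA"
without_codon = "GGATCC" + "CATCATCATCATCAT" + "TTGACA"
primer = primer_for(codon)
print(extension_expected(with_codon, primer))     # True
print(extension_expected(without_codon, primer))  # False
```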
  • In a preferred embodiment the admixture of identifier oligonucleotide and primer is thermocycled to obtain a sufficient number of copies of the extension product. The thermocycling is typically carried out by repeatedly increasing and decreasing the temperature of the mixture within a temperature range whose lower limit is about 30 degrees Celsius (30° C.) to about 55° C. and whose upper limit is about 90° C. to about 100° C. The increasing and decreasing can be continuous, but is preferably phasic, with time periods of relative temperature stability at each of the temperatures favouring polynucleotide synthesis, denaturation and hybridization.
  • When a single complex is analysed in accordance with the present method, the result may be used to verify the presence or absence of a specific chemical entity during the formation of the display molecule. The formation of an extension product is indicative of the presence of an oligonucleotide part complementary to the primer in the identifier oligonucleotide. Conversely, the absence of an extension product is indicative of the absence of an oligonucleotide part complementary to the primer in the identifier oligonucleotide. Selecting the sequence of the primer such that it is complementary to one or more codons will therefore provide information on the structure of the encoded molecule coded for by the codon(s).
  • In a preferred aspect of the invention, a second primer complementary to a sequence of the extension product is included in the mixture of the identifier oligonucleotide and the primer oligonucleotide. The second primer is also termed the reverse primer and ensures an exponential increase of the number of produced extension products. The method using a forward and a reverse primer is well known to the person skilled in the art and is generally referred to as polymerase chain reaction (abbreviated PCR) in the present application with claims. In one embodiment of the invention the reverse primer is annealed to a part of the extension product downstream, i.e. near the 3′ end of the extension product, or a part complementing the coding part of the identifier oligonucleotide. In another embodiment, the first primer (forward primer) anneals to an upstream position of the identifier oligonucleotide, preferably before the coding part, and the reverse primer anneals to a sequence of the extension product complementing one or more codons or parts thereof.
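  • The effect of adding a reverse primer can be illustrated with an idealised model: with a single primer each template yields at most one new product per cycle (linear growth), whereas with forward and reverse primers every strand can serve as template in the next cycle (exponential growth). The function names and the per-cycle efficiency parameter are assumptions introduced for this sketch, which ignores reagent depletion and plateau effects.

```python
def products_single_primer(templates: int, cycles: int) -> int:
    """Linear accumulation: one extension product per template per cycle."""
    return templates * cycles


def copies_two_primers(templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Exponential accumulation; efficiency = 1.0 means perfect doubling per cycle."""
    return templates * (1.0 + efficiency) ** cycles


print(products_single_primer(1000, 30))  # 30000
print(copies_two_primers(1000, 30))      # roughly 1.07e12
```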
  • The amplicons resulting from the PCR process may be stained during or following the reaction to ease detection. Staining after the PCR process may be performed with e.g. ethidium bromide or a similar staining agent. As an example, amplicons from the PCR process are run on an agarose gel and subsequently stained with ethidium bromide. Under UV illumination, bands of amplicons become visible. It is possible to incorporate the staining agent in the agarose gel or to allow a solution of the staining agent to migrate through the gel. The amplicons may also be stained during the PCR process by an intercalating agent, like SYBR® Green. When the intercalating agent is present while the amplification proceeds, it is incorporated into the double helix. The intercalating agent may then be made visible by irradiation with a suitable source.
  • The intensity of the staining is informative of the relative abundance of a specific amplicon. Thus, it is possible to quantify the occurrence of a codon in an identifier oligonucleotide. When a library of bifunctional complexes has been subjected to a selection, the codons in the pool of identifier oligonucleotides which have been selected can be quantified using this method. As an example, a sample of the selected identifier oligonucleotides is subjected to various PCR amplifications with different primers in separate compartments, and the PCR product of each compartment is analysed by electrophoresis in the presence of ethidium bromide. The bands that appear can be quantified by densitometric analysis after irradiation by ultraviolet light, and the relative abundance of the codons can be measured.
  • Alternatively, the primers may be labelled with a suitable small molecule, like biotin or digoxigenin. A PCR-ELISA analysis may subsequently be performed based on the amplicons comprising the small molecule. A preferred method includes the application of a solid support covered with streptavidin or avidin when biotin is used as the label, and with anti-digoxigenin when digoxigenin is used as the label. Once captured, the amplicons can be detected using an enzyme-labelled avidin or anti-digoxigenin reporter molecule, similar to a standard ELISA format.
  • To avoid the laborious post-PCR handling steps required to evaluate the amplicons, it is in a certain embodiment preferred to measure the extension process in “real time”. Several real time PCR processes have been developed, and all suitable real time PCR processes available to the person skilled in the art can be used in the evaluating step of the present invention and are included in the present scope of protection. The PCR reactions discussed below are of particular interest.
  • The monitoring of accumulating amplicons in real time has been made possible by labelling of primers, probes, or amplicons with fluorogenic molecules. Real time PCR amplification is usually faster than conventional PCR, mainly due to reduced cycle times and the use of sensitive methods for detection of emissions from the fluorogenic labels. The most commonly used fluorogenic oligoprobes rely upon fluorescence resonance energy transfer (FRET) between fluorogenic labels or between one fluorophore and a dark or “black-hole” nonfluorescent quencher (NFQ), which disperses energy as heat rather than fluorescence. FRET is a spectroscopic process by which energy is passed between molecules separated by 10-100 Å that have overlapping emission and absorption spectra. An advantage of many real time PCR methods is that they can be carried out in a closed system, i.e. a system which does not need to be opened to examine the result of the PCR. A closed system implies a reduced result turnaround time, minimisation of the potential for carry-over contamination and the ability to closely scrutinise the assay's performance.
  • The real time PCR methods currently available to the skilled person can be classified into either amplicon sequence specific or non-specific methods. The basis for the non-specific detection methods is a DNA-binding fluorogenic molecule. Included in this class are the earliest and simplest approaches to real time PCR. Ethidium bromide, YO-PRO-1, and SYBR® Green I all fluoresce when associated with double stranded DNA which is exposed to a suitable wavelength of light. This approach requires the fluorescent agent to be present during the PCR process and provides for a real time detection of the fluorescent agent as it is incorporated into the double stranded helix.
  • The amplicon sequence specific methods include, but are not limited to, the TaqMan®, hairpin, LightCycler®, Sunrise®, and Scorpion methods. The LightCycler® method, also designated “HybProbes”, makes use of a pair of adjacent, fluorogenic hybridisation oligonucleotide probes. A first, usually the upstream, oligoprobe is labelled with a 3′ donor fluorophore, and the second, usually the downstream, probe is commonly labelled with either a LightCycler Red 640 or Red 705 acceptor fluorophore at the 5′ terminus, so that when both oligoprobes are hybridised the two fluorophores are located in close proximity, such as within 10 nm, of each other. The close proximity provides for the emission of fluorescence when irradiated with a suitable light source, such as a blue diode in the case of the LightCycler®. The region for annealing of the probes may be any suitable position that does not interfere with the primer annealing. In a suitable setup, the site for binding the probes is positioned downstream of the codon region on the identifier oligonucleotide. Alternatively, when a reverse primer is used, the region for annealing the probes may be at the 3′ end of the strand complementing the identifier oligonucleotide. In another embodiment of the LightCycler® method, the pair of oligonucleotide probes is annealed to one or more codons, and primer sites exterior to the coding part of the identifier oligonucleotide are used for PCR amplification.
  • The TaqMan® method, also referred to as the 5′ nuclease or hydrolysis method, requires an oligoprobe to which a reporter fluorophore, such as 6-carboxy-fluorescein, and a quencher fluorophore, such as 6-carboxy-tetramethyl-rhodamine, are attached, one at each end. When the labels are in close proximity, i.e. when the oligoprobe is annealed to an identifier oligonucleotide or to a sequence complementing the identifier oligonucleotide, the quencher will “hijack” the emissions that have resulted from the excitation of the reporter. As the polymerase progresses along the relevant strand, it displaces and then hydrolyses the oligoprobe via its 5′→3′ endonuclease activity. Once the reporter is removed from the extinguishing influence of the quencher, it is able to release excitation energy at a wavelength that can be monitored by a suitable instrument, such as the ABI Prism® 7700. The fractional cycle number at which the real-time fluorescence signal mirrors progression of the reaction above the background noise is normally used as an indicator of successful identifier oligonucleotide amplification. This threshold cycle (CT) is defined as the PCR cycle in which the gain in fluorescence generated by the accumulating amplicons exceeds 10 standard deviations of the mean baseline fluorescence. The CT is inversely related to the logarithm of the number of identifier oligonucleotide copies initially present in the sample, so that a lower CT indicates more copies. The TaqMan probe is usually designed to hybridise at a position downstream of a primer binding site, be it a forward or a reverse primer. When the primer is designed to anneal to one or more codons of the identifier oligonucleotide, the presence of these one or more codons is indicated by the emission of light. Furthermore, the quantity of the identifier oligonucleotides comprising the one or more codons may be measured by the CT value.
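  • The threshold cycle defined above can be estimated from a fluorescence trace with a sketch such as the following. It interprets the definition as "baseline mean plus 10 standard deviations of the baseline", takes the first readings as the baseline, and interpolates linearly between cycles to obtain a fractional cycle number. The length of the baseline window, the interpolation and the function name are assumptions of this sketch, not details given in the text.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def threshold_cycle(fluorescence: Sequence[float],
                    baseline_cycles: int = 10,
                    n_sd: float = 10.0) -> Optional[float]:
    """Return the fractional cycle at which the signal first exceeds the threshold.

    fluorescence[0] is the reading after cycle 1. Returns None if the signal
    never rises above baseline mean + n_sd * baseline standard deviation.
    """
    baseline = fluorescence[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(baseline_cycles, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if curr > threshold:
            # Reading i-1 belongs to cycle i; interpolate towards cycle i+1.
            fraction = (threshold - prev) / (curr - prev)
            return i + fraction
    return None


# Made-up trace: ten baseline readings followed by an exponential rise.
trace = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.02, 0.98, 1.01, 0.99,
         1.05, 1.20, 1.60, 2.40, 4.00]
print(threshold_cycle(trace))
```

  • With this made-up trace the threshold is crossed between cycles 11 and 12. A sample containing more identifier copies would cross it earlier.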
  • The hairpin method involves an oligoprobe in which a fluorophore and a quencher are positioned at the termini. The labels are held in close proximity by distal stem regions of homologous base pairing deliberately designed to create a hairpin structure, which results in quenching either by FRET or by a direct energy transfer through a collisional mechanism due to the intimate proximity of the labels. When direct energy transfer by a collisional mechanism is used, the quencher is usually different from that used for the FRET mechanism, and is suitably 4-(4′-dimethylamino-phenylazo)-benzene (DABCYL). In the presence of a complementary sequence, usually downstream of a primer, or within the bounds of the primer binding sites in case more than a single primer is used, the oligoprobe will hybridise, shifting into an open configuration. The fluorophore is now spatially removed from the quencher's influence, and fluorescence emissions are monitored during each cycle. In a certain aspect, the hairpin probe may be designed to anneal to a codon in order to detect this codon if present on the identifier oligonucleotide. This embodiment may be suitable if codons differ from each other by only a single or a few nucleotides, because it is well known that the occurrence of a mismatch between a hairpin oligoprobe and its target sequence has a greater destabilising effect on the duplex than the introduction of an equivalent mismatch between the target oligonucleotide and a linear oligoprobe. This is probably because the hairpin structure provides a highly stable alternate conformation.
  • The Sunrise and Scorpion methods are similar in concept to the hairpin oligoprobe, except that the label becomes irreversibly incorporated into the PCR product. The Sunrise method involves a primer (commercially available as Amplifluor™ hairpin primers) comprising a 5′ fluorophore and a quencher, e.g. DABCYL. The labels are separated by complementary stretches of sequence that create a stem when the sunrise primer is closed. At the 3′ terminus is a target specific primer sequence. In a preferred embodiment the target sequence is a codon, optionally more codons. The sunrise primer's sequence is intended to be duplicated by the nascent complementary strand and, in this way, the stem is destabilised, the two fluorophores are held apart, usually by between 15 and 25 nucleotides, and the fluorophore is free to emit its excitation energy for monitoring. The Scorpion primer resembles the sunrise primer, but differs in having a moiety that blocks duplication of the signalling portion of the scorpion primer. The blocking moiety is typically hexaethylene glycol. In addition to the difference in structure, the function of the scorpion primers differs slightly in that the 5′ region of the oligonucleotide is designed to hybridise to a complementary region within the amplicons. In a certain embodiment the complementary region is a codon on the identifier oligonucleotide. The hybridisation forces the labels apart, disrupting the hairpin and permitting emission in the same way as the hairpin probes.
  • After the selection has been performed, the codon profile is indicative of the chemical entities that have been used in the synthesis of encoded molecules having a certain property, such as an affinity towards a target. In the event the selection has been sufficiently effective, it may be possible to deduce directly a part or the entire structure of encoded molecules with the desired property. Alternatively, it may be possible to deduce a structural unit appearing more frequently among the encoded molecules after the selection, which gives important information on the structure-activity relationship (SAR). If the selection process has not narrowed the size of the library to a manageable number, the formation of a second-generation library is useful. In the formation of the second-generation library, chemical entities which have not been involved in the synthesis of encoded molecules that have been successful in the selection may be omitted, thus limiting the size of the new library and at the same time increasing the concentration of complexes with the requested property, e.g. the ability to bind to a target. The second-generation library may then be subjected to more stringent selection conditions to allow only the encoded molecules with a higher affinity to bind to the target. The second-generation library may also be generated using the chemical entities coded for in addition to certain chemical entities suspected of increasing the performance of the final encoded molecule. The indication of certain successful chemical entities may be obtained from the SAR. The use in a second-generation library of chemical entities which have proved to be interesting for further investigation in a preceding library may thus entail a shuffling with new chemical entities that may focus the second-generation library in a certain desired direction.
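  • One way to turn a codon profile into a second-generation building-block set is sketched below: entities whose codon frequency did not increase in the selection are omitted, and additional entities suspected of improving performance can be shuffled in. The enrichment ratio, the cut-off of 1.0 and all names and numbers are assumptions chosen for illustration.

```python
def enrichment(before: dict[str, float], after: dict[str, float]) -> dict[str, float]:
    """Ratio of codon frequency after selection to frequency before selection."""
    return {codon: after.get(codon, 0.0) / freq
            for codon, freq in before.items() if freq > 0}


def second_generation(before: dict[str, float],
                      after: dict[str, float],
                      min_enrichment: float = 1.0,
                      extra: tuple[str, ...] = ()) -> list[str]:
    """Keep enriched entities and append extra entities not already kept."""
    scores = enrichment(before, after)
    kept = [codon for codon, score in scores.items() if score >= min_enrichment]
    return kept + [entity for entity in extra if entity not in kept]


# Made-up codon frequencies for one position, before and after selection.
before = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
after = {"A": 0.60, "B": 0.35, "C": 0.05, "D": 0.00}
print(second_generation(before, after, extra=("E",)))  # ['A', 'B', 'E']
```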
  • An example of implicit identification of codons is one in which the nucleic acid fragment is associated with a chemical entity precursor capable of being transferred to a recipient reactive group. The recipient reactive group may be a part of a chemical scaffold, and the chemical entity precursor may add a structural unit to said scaffold. It is preferred that the nucleic acid fragment codes for the chemical entity. In some aspects of the present invention each member of the nucleic acid fragment pool comprises an anticodon, which identifies the chemical entity. When a plurality of chemical entities is present, the anticodon is preferably unique, i.e. a unique correspondence between the chemical entities and the associated anticodons exists.
  • The identifier nucleic acid sequence comprises codons, which may be able to pair with one or more anticodons of the pool of nucleic acid fragments. The pairing between one or more codons of an identifier nucleic acid sequence and one or more anticodons is preferably specific, i.e. the one or more codons of the identifier nucleic acid sequence are only recognized by particular anticodons. A nucleic acid fragment containing more than one anticodon can encode scaffold molecules, where each anticodon encodes specific chemical entities of that scaffold molecule. The specific pairing makes it possible to decode the codon of an identifier nucleic acid sequence implicitly. In the method according to the invention, non-specific pairings between codons and anticodons can be cleaved with an enzyme or treated chemically to break the double stranded nucleotides. The non-pairing region can be cleaved using enzymes that specifically cleave nucleotide sequences with mismatches. Notably, the enzyme is selected from T4 endonuclease VII, T4 endonuclease I, CEL I, nuclease S1, or variants thereof. The cleavage is preferably used when more than one codon and anticodon are involved in pairing between the identifier nucleic acid sequence and the nucleic acid fragment.
  • The pool of nucleic acid fragments associated with a chemical entity may comprise anticodons complemented by codons of one or more identifier nucleic acid sequence as well as anticodons which are not complemented by codons on any identifier nucleic acid sequence. In other words, the amount of genetic information contained in the anticodons of the pool is larger than the amount of genetic information complemented by the codons.
  • The contacting of the one or more identifier nucleic acid sequences with the pool of nucleic acid fragments is usually conducted at conditions which allow for hybridisation, i.e. conditions at which cognate nucleic acid sequences can anneal to each other. To facilitate the recovery of nucleic acid fragments which have annealed to the identifier nucleic acid sequences, the identifier nucleic acid sequences are usually immobilized on a solid support. Examples of suitable solid supports include beads and column material, e.g. beads and column material associated with a second part of a molecular affinity pair to bind identifier nucleic acid sequences attached to the first part of the molecular affinity pair. In certain aspects of the invention the solid support is associated with streptavidin and the identifier nucleic acid sequences are attached to biotin.
  • When the identifier nucleic acid sequences are immobilized on a solid support, the pool of nucleic acid fragments is typically present in a mobile phase, i.e. dissolved in a liquid. The identifier nucleic acids will hybridise to those nucleic acid fragments in the pool which are sufficiently complementary to a particular part of an identifier nucleic acid sequence for binding to occur. Fragments not finding any complementing sequence will remain in the solution. In the event the identifier nucleic acid sequences are segregated into codons and the fragments comprise anticodons, the anticodons which are able to anneal to a codon will be caught, while fragments not having a cognate codon will be maintained in the mobile phase. When codons and anticodons are present in the method of the present invention, specific hybridisation implies that the tendency of an anticodon to cross-hybridise to another codon will be impeded or avoided. To avoid cross-hybridisation, codons may be designed such that each codon is distinguished from all other codons by one, two or more mismatching nucleotides.
  • The mobile phase is subsequently separated from the solid phase, e.g. by washing, and the enriched pool of fragments is recovered. The recovery of the nucleic acid fragments is usually done by subjecting the hybrid to denaturing conditions, i.e. conditions which separate the two strands. If the parent nucleic acid sequences are immobilized on beads, the separation of the fragments can be effected using denaturing conditions and centrifugation/spinning.
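  • The codon design rule mentioned above, that each codon differs from all others by a minimum number of nucleotides, can be sketched with a simple greedy search. Counting mismatches position by position (Hamming distance) is a simplification, since actual cross-hybridisation also depends on factors such as melting temperature and mismatch position. The function names and parameter choices are hypothetical.

```python
from itertools import product


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def design_codons(length: int, min_mismatches: int, count: int) -> list[str]:
    """Greedily pick codons so that every pair differs in at least min_mismatches positions."""
    chosen: list[str] = []
    for candidate in product("ACGT", repeat=length):
        seq = "".join(candidate)
        if all(mismatches(seq, other) >= min_mismatches for other in chosen):
            chosen.append(seq)
            if len(chosen) == count:
                return chosen
    raise ValueError("not enough codons satisfy the mismatch requirement")


codons = design_codons(length=4, min_mismatches=2, count=8)
print(codons)
```

  • The greedy search is not guaranteed to find the largest possible codon set; it only returns one set that satisfies the stated mismatch requirement.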
  • The enriched pool of nucleic acid fragments associated with a chemical entity may be used directly to prepare a next generation library of complexes, in which each member of the library comprises an encoded molecule and the nucleic acid sequence which codes for this molecule. In one embodiment of the invention, building blocks comprising a particular transferable chemical entity associated with an anticodon corresponding to the anticodons of the detected fragments are used in the generation of the next generation library. In another embodiment, additional building blocks are added having modified transferable chemical entities in order to improve on a certain property of the encoded molecule.
  • The complexes may be prepared by various known methods starting from the nucleic acid fragment comprising the anticodon and the chemical entity, as disclosed above. According to a particular method, the next generation library is formed by a) mixing under hybridisation conditions, nascent bifunctional complexes comprising a chemical entity or a reaction product of chemical entities, and an identifier nucleic acid sequence comprising codon(s) identifying said chemical entities, with the recovered nucleic acid fragments, said fragments comprising an oligonucleotide sufficiently complementary to at least a part of the identifier nucleic acid sequence to allow for hybridisation, a transferable chemical entity and an anticodon identifying the chemical entity, to form hybridisation products; and b) transferring the chemical entities of the nucleic acid fragments to the nascent bifunctional complexes through a reaction involving a reactive group of the nascent bifunctional complex, in conjunction with a transfer of the genetic information of the anticodon.
  • Preferably, the above method for preparing the next generation library comprises the further step of c) separating the components of the hybridisation product and recovering the complexes. If further chemical entities are intended to participate in the formation of the encoded molecule of the nascent complex, steps a) through c) are repeated as appropriate using the recovered complexes in step c) as the nascent bifunctional complexes in step a) of the next round.
  • The genetic information of the anticodon may be transferred to the nascent complex by a variety of methods. According to a first embodiment the genetic information of the anticodon is transferred by enzymatically extending the oligonucleotide identifier region to obtain a codon attached to the bifunctional complex having received the chemical entity. A second embodiment implies that genetic information of the anticodon is transferred to the nascent complexes by hybridisation to a cognate codon of the nascent complex.
  • According to the first embodiment, the enriched pool of fragments comprises an affinity oligonucleotide sufficiently complementary to an identifier region of the nascent complex, said oligonucleotide being distinct from the anticodon. Accordingly, the oligonucleotide identifier region of the nascent complex anneals to the affinity oligonucleotide of the building block to form the hybridisation product, while the anticodon remains single stranded. Subsequently, the chemical entity is transferred to the recipient reactive group of the complex to form the encoded molecule prior to, simultaneously with, or subsequent to the enzymatic extension of the hybridisation product using the anticodon as template. Specific examples of suitable enzymes are polymerases and ligases, which require dNTPs and oligonucleotides, respectively, as substrates. The method for forming the complexes according to this first embodiment is the subject of PCT/DK03/00739, the content thereof being incorporated herein by reference.
  • According to the second embodiment, the anticodon forms part of the affinity oligonucleotide, i.e. the anticodon is a part of or the entire affinity oligonucleotide. Initially, a plurality of identifiers comprising different codons and/or a different order of codons is provided. The identifiers are associated with a recipient reactive group, i.e. the reactive group may be covalently attached to the identifier or attached by hybridisation. Notably, a codon of the identifier may be used for the attachment of a building block harbouring the reactive group. The identifiers are subsequently contacted with the enriched pool of building blocks, i.e. nucleic acid fragments associated with a transferable chemical entity. The mixture of identifiers and building blocks is maintained at hybridisation conditions to anneal the anticodon of the building blocks to the cognate codon of the identifier. After or simultaneously with the annealing step, the chemical entity is transferred to the recipient reactive group of the identifier. The method for forming the complexes according to the second embodiment is the subject of various patent applications, including WO 02/103008, WO 02/074929, Danish patent application No. PA 2002 01347, and U.S. provisional patent application No. 60/409,968. The content of these patent applications is incorporated herein by reference in its entirety.
  • The new generation of library complexes may be used in a partition step, in which the library of complexes is subjected to a condition partitioning complexes displaying a predetermined property from the remainder of the next generation library, as explained above. Thus, using the present method, it is possible to repeat the partitioning procedure a desired number of times using still more stringent conditions, until a single or a few encoded molecules are identified which display the desired property to a high extent. When the partitioning is based on an affinity assay, the library of encoded molecules is increasingly narrowed in size from one generation to the next, and at the same time the high affinity binders are increased in concentration.
  • The outcome of a codon analysis depends on the enrichment factor in the selection process. An efficient and specific selection will generate a large difference between the specific binders and the background. Still, there will be a large amount of molecules in the background, which will reduce the possibility of obtaining measurable differences between the binders and the background in the codon analysis procedure. If the enrichment factor is not good enough (or the library is too large) to distinguish a specific binder among the background binders, the signal of that binder in the codon analysis will probably not be detectable. However, there will be a continuum of binders that use a certain chemical entity in a certain position. These “non optimal” binders (a certain important chemical entity in one position and less important entities in the other positions) will be many, due to the diversity obtained when only one (or a few) positions are important in the selection process. Therefore, the sum of all molecules with a preferable chemical entity in a certain position will be larger than the sum of all molecules with a non-binding chemical entity, which will make the codon analysis easier.
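  • The summation argument above can be illustrated with a toy model in which every molecule carrying one particular entity in the first position is recovered at twice the background level. The library dimensions, the twofold factor and the function names are made-up assumptions. The point is only that the per-position codon totals separate clearly even though no individual molecule stands out strongly.

```python
from itertools import product
from typing import Callable


def codon_totals(signal: Callable[[tuple[int, ...]], float],
                 n_entities: int, n_positions: int) -> list[list[float]]:
    """Sum the signal of all molecules over each (position, entity) pair."""
    totals = [[0.0] * n_entities for _ in range(n_positions)]
    for molecule in product(range(n_entities), repeat=n_positions):
        s = signal(molecule)
        for position, entity in enumerate(molecule):
            totals[position][entity] += s
    return totals


def toy_signal(molecule: tuple[int, ...]) -> float:
    """Assumed model: entity 3 in position 0 doubles recovery over background."""
    return 2.0 if molecule[0] == 3 else 1.0


totals = codon_totals(toy_signal, n_entities=10, n_positions=3)
print(totals[0][3])  # 200.0  (preferred entity in the important position)
print(totals[0][0])  # 100.0  (non-binding entity in the same position)
print(totals[1][0])  # 110.0  (any entity in a position that is not important)
```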
  • This invention may involve an extensive analysis of all the chemical entities in a library and how they are involved in the binding to targets. This information can be used both to design new libraries and in the final process where the lead structures are produced and pre-clinical candidates are picked. The extensive data obtained in the codon analysis can for instance be used for selecting candidates with the appropriate specificity. This can be done if selection has been performed on a family of proteins where one of the members is the target.
  • The invention enables pharmacophore identification and transformation into small molecule drugs. In cases where peptide-like libraries are used, the peptide/peptidomimetic lead to small molecule conversion process is supported by medicinal chemistry and cheminformatics and guided by matching the pharmacophore derived from massive structure activity relationship (SAR) data information from the codon analysis. A “pharmacophore” is a description of the structural criteria a molecule must fulfil in order to be active against a specified biological receptor. These criteria are usually the 3D spatial relationships of a set of chemical features, and sometimes include the steric boundaries within which the molecule must fit. There is a set of software methods which automatically infer such pharmacophores, given a SAR, in the absence of direct macromolecular structural data.
  • The extensive SAR information obtained using the codon analyses described in this invention can be combined with molecular modeling technologies to refine for example pharmacophore models and the plausible interactions between the potential binders and a target.
  • The codon analysis is also a valuable experimental tool for SAR on weak binders. The codon analysis measures the abundance of chemical entities after a selection in all binding molecules. Thus, even weak binders, of which there might be many, are detected even though the detected codon is selected in many different combinations. The selection procedure can also be tuned to enrich predominantly for weak binders, which will simplify the codon analysis data.
  • This invention is also suitable for replacing the laborious task of extracting SAR information by hand with an automated process using suitable algorithms and software programs. The codon analysis data (e.g. array or QPCR measurements) can be fed directly into a data handling software program that uses both the codon abundances and structural data to generate SAR information and potential pharmacophore models.
  • The SAR information and potential pharmacophore models obtained from the codon analysis can be used to design focused libraries in an array format allowing massive and parallel testing. Thus, the selection procedure and codon analysis can be seen as a diversity reduction step to allow a complete test of potential binders in an array format.
  • Various methods for identifying the codons of the identifiers of step iii) are disclosed herein. When a pool of partitioned identifier nucleic acid sequences is subjected to the identification step it is normally not practical to decode a sufficient number of sequences comprising the entire “genome” of an encoded molecule to ensure that all interesting encoded molecules have been revealed. Therefore, a modified sequencing technique preferably identifies the codons in each position occurring with the highest frequency. The next generation library is then built using in each position the chemical entities occurring with the highest frequency.
  • In a certain embodiment of the invention, the codon identification step uses the entire population of identifier nucleic acid sequences in the analysis and informs the experimenter of the relative abundance of each codon in a certain position. The codon information may be obtained using microarray, QPCR, or any equivalent method for revealing the identity of codons. In contrast, sequencing a subset of identifier nucleic acid sequences only provides the experimenter with a limited insight as to the population of codons and the corresponding encoded molecules.
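  • The per-position bookkeeping described above can be sketched in a few lines of Python. This is a minimal sketch, assuming that the codon analysis has already been reduced to a list of identifiers written as tuples of codon labels; the function names, the labels and the example pool are hypothetical. Real array or QPCR read-outs would deliver the per-position abundances directly rather than individual identifiers.

```python
from collections import Counter


def codon_abundance(identifiers):
    """Relative abundance of each codon in each position.

    `identifiers` is a list of equal-length tuples of codon labels.
    Returns one {codon: fraction} dict per position.
    """
    if not identifiers:
        return []
    total = len(identifiers)
    n_positions = len(identifiers[0])
    result = []
    for pos in range(n_positions):
        counts = Counter(ident[pos] for ident in identifiers)
        result.append({codon: n / total for codon, n in counts.items()})
    return result


def top_codons(abundance, k=1):
    """Pick the k most abundant codons in each position (ties broken by label)."""
    picked = []
    for freqs in abundance:
        ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
        picked.append([codon for codon, _ in ranked[:k]])
    return picked


# Hypothetical pool of identifiers recovered after a partitioning step.
pool = [
    ("A1", "B2", "C3"),
    ("A1", "B1", "C3"),
    ("A1", "B2", "C1"),
    ("A2", "B2", "C3"),
]
abundance = codon_abundance(pool)
next_generation = top_codons(abundance, k=1)
print(next_generation)
```

  • The list returned by `top_codons` names, for each position, the codons whose chemical entities would be carried into the next generation library; choosing `k` larger than 1 keeps more diversity at the cost of a larger library.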
  • DETAILED DESCRIPTION OF THE INVENTION
  • Complex
  • The complex comprises an encoded molecule and an identifier oligonucleotide. The identifier comprises codons that identify the encoded molecule. Preferably, the identifier oligonucleotide identifies the encoded molecule uniquely, i.e. in a library of complexes a particular identifier is capable of distinguishing the molecule it is attached to from the rest of the molecules.
  • The encoded molecule and the identifier may be attached directly to each other or through a bridging moiety. In one aspect of the invention, the bridging moiety is a selectively cleavable linkage.
  • The identifier oligonucleotide may comprise two or more codons. In a preferred aspect the identifier oligonucleotide comprises three or more codons. The sequence of each codon can be decoded utilizing the present method to identify reactants used in the formation of the encoded molecule. When the identifier comprises more than one codon, each member of a pool of chemical entities can be identified and the order of codons is informative of the synthesis step each member has been incorporated in.
  • In a certain embodiment, the same codon is used to code for several different chemical entities. In a subsequent identification step, the structure of the encoded molecule can be deduced taking advantage of the knowledge of different attachment chemistries, steric hindrance, deprotection of orthogonal protection groups, etc. In another embodiment, the same codon is used for a group of chemical entities having a common property, such as a lipophilic nature, a certain attachment chemistry etc. In a preferred embodiment, however, the codon is unique i.e. a similar combination of nucleotides does not appear on the identifier oligonucleotide coding for another chemical entity. In a practical approach, for a specific chemical entity, only a single combination of nucleotides is used. In some aspects of the invention, it may be advantageous to use several codons for the same chemical entity, much in the same way as Nature uses up to six different codons for a single amino acid. The two or more codons identifying the same chemical entity may carry further information related to different reaction conditions.
  • The sequence of the nucleotides in each codon may have any suitable length. The codon may be a single nucleotide or a plurality of nucleotides. In some aspects of the invention, it is preferred that each codon independently comprises four or more nucleotides, more preferred 4 to 30 nucleotides. In some aspects of the invention the lengths of the codons vary.
  • A certain codon may be distinguished from any other codon in the library by only a single nucleotide. However, to facilitate a subsequent decoding process and to increase the ability of the primer to discriminate between codons it is in general desired to have two or more mismatches between a particular codon and any other codon appearing on the identifier oligonucleotide. As an example, if a codon length of 5 nucleotides is selected, more than 100 nucleotide combinations exist in which two or more mismatches appear. For a certain number of nucleotides in the codon, it is generally desired to optimize the number of mismatches between a particular codon and any other codon appearing in the library.
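  • The mismatch requirement can be checked computationally. The following Python sketch uses a simple greedy search, which is an illustrative technique chosen here and not a method disclosed in this description, to collect codons of a given length such that any two differ in at least a given number of positions. The function names and default parameters are assumptions.

```python
from itertools import product


def mismatches(a, b):
    """Number of positions in which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_codon_set(length=5, min_mismatches=2, alphabet="ACGT"):
    """Greedily collect codons that differ pairwise in at least `min_mismatches` positions.

    Candidates are visited in lexicographic order; a candidate is kept only if it
    is sufficiently different from every codon kept so far.
    """
    chosen = []
    for candidate in product(alphabet, repeat=length):
        word = "".join(candidate)
        if all(mismatches(word, kept) >= min_mismatches for kept in chosen):
            chosen.append(word)
    return chosen


codons = greedy_codon_set()
print(len(codons))
```

  • For 5-nucleotide codons and a minimum of two mismatches, this search returns 256 codons, consistent with the statement that more than 100 such combinations exist. Raising `min_mismatches` shrinks the set, which is the trade-off between codon count and discrimination referred to above.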
  • The identifier oligonucleotide will in general have at least two codons arranged in sequence, i.e. next to each other. Two neighbouring codons may be separated by a framing sequence. Depending on the encoded molecule formed, the identifier may comprise further codons, such as 3, 4, 5, or more codons. Each of the further codons may be separated by a suitable framing sequence. Preferably, all or at least a majority of the codons of the identifier are separated from a neighbouring codon by a framing sequence. The framing sequence may have any suitable number of nucleotides, e.g. 1 to 20. Alternatively, codons on the identifier may be designed with overlapping sequences.
  • The framing sequence, if present, may serve various purposes. In one setup of the invention, the framing sequence identifies the position of the codon. Usually, the framing sequence either upstream or downstream of a codon comprises information which positions the chemical entity and the reaction conditions in the synthesis history of the encoded molecule. The framing sequence may also or in addition provide for a region of high affinity. The high affinity region may ensure that a hybridisation event with an anti-codon will occur in frame. Moreover, the framing sequence may adjust the annealing temperature to a desired level.
  • A framing sequence with high affinity can be provided by incorporation of one or more nucleobases forming three hydrogen bonds to a cognate nucleobase. Examples of nucleobases having this property are guanine and cytosine. Alternatively, or in addition, the framing sequence may be subjected to backbone modification. Several backbone modifications provide for higher affinity, such as 2′-O-methyl substitution of the ribose moiety, peptide nucleic acids (PNA), and 2′-4′ O-methylene cyclisation of the ribose moiety, also referred to as LNA (Locked Nucleic Acid).
  • The sequence comprising a codon and an adjacent framing sequence has in a certain aspect of the invention a total length of 11 nucleotides or more, preferably 15 nucleotides or more. A primer may be designed to be complementary to the codon sequence as well as the framing sequence. The occurrence of an extension reaction under conditions allowing for such a reaction to occur is indicative of the presence of the chemical entity encoded in the codon as well as the position said chemical entity has in the entire synthesis history of the encoded molecule.
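  • The layout of codons and position-specific framing sequences can be mimicked in silico. The following Python sketch is illustrative only: the sequences, the upstream placement of each framing sequence and the function names are hypothetical, and a plain substring search merely stands in for primer annealing and extension.

```python
def build_identifier(codons, framings):
    """Concatenate framing/codon pairs 5'->3': framing_1 codon_1 framing_2 codon_2 ..."""
    if len(codons) != len(framings):
        raise ValueError("one framing sequence is needed per codon")
    return "".join(framing + codon for framing, codon in zip(framings, codons))


def decode_position(identifier, framing, codon_length):
    """Return the codon directly downstream of a position-specific framing sequence.

    Returns None if the framing sequence is absent or the codon is truncated.
    """
    start = identifier.find(framing)
    if start < 0:
        return None
    begin = start + len(framing)
    codon = identifier[begin:begin + codon_length]
    return codon if len(codon) == codon_length else None


# Hypothetical GC-rich framing sequences (6 nt) and AT-rich codons (5 nt),
# giving a codon-plus-framing unit of 11 nucleotides.
framings = ["GCGCGC", "CGGCCG", "GGCCGC"]
codons = ["ATTAC", "TATTA", "AATAT"]
identifier = build_identifier(codons, framings)

decoded = [decode_position(identifier, f, 5) for f in framings]
print(identifier)
print(decoded)
```

  • Because each framing sequence is unique to one position in this sketch, finding it identifies both the codon and the step of the synthesis history to which the codon belongs.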
  • The identifier may comprise flanking regions around the coding section. The flanking regions can also serve as priming sites for amplification reactions, such as PCR or as binding region for oligonucleotide probe. The identifier may in certain embodiments comprise an affinity region having the property of being able to hybridise to a building block.
  • It is to be understood that when the term identifier oligonucleotide is used in the present description and claims, the identifier oligonucleotide may be in the sense or the anti-sense format, i.e. the identifier can be a sequence of codons which actually codes for the encoded molecule or can be a sequence complementary thereto. Moreover, the identifier may be single-stranded or double-stranded, as appropriate.
  • The encoded molecule part of the complex is generally of a structure expected to have an effect according to the property sought for, e.g. the encoded molecule has a binding affinity towards a target. When the target is of pharmaceutical importance, the encoded molecule is generally a possible drug candidate. The complex may be formed by tagging a library of different possible drug candidates with a tag, e.g. a nucleic acid tag identifying each possible drug candidate. In another embodiment of the invention, the molecule is formed by a variety of reactants which have reacted with each other and/or a scaffold molecule. Optionally, this reaction product may be post-modified to obtain the final molecule displayed on the complex. The post-modification may involve the cleavage of one or more chemical bonds attaching the encoded molecule to the identifier in order to display the encoded molecule more efficiently.
  • The formation of an encoded molecule generally starts with a scaffold, i.e. a chemical unit having one or more reactive groups capable of forming a connection to another reactive group positioned on a chemical entity, thereby generating an addition to the original scaffold. A second chemical entity may react with a reactive group also appearing on the original scaffold or a reactive group incorporated by the first chemical entity. Further chemical entities may be involved in the formation of the final reaction product. The formation of a connection between the chemical entity and the nascent encoded molecule may be mediated by a bridging molecule. As an example, if the nascent encoded molecule and the chemical entity both comprise an amine group, a connection between these can be mediated by a dicarboxylic acid. A synthetic molecule is in general produced in vitro and may be a naturally occurring or an artificial substance. Usually, a synthetic molecule is not produced using the natural translation system in an in vitro process.
  • The chemical entities that are precursors for structural additions or eliminations of the encoded molecule may be attached to a building block prior to the participation in the formation of the reaction product leading the final encoded molecule. Besides the chemical entity, the building block generally comprises an anti-codon. In some embodiments the building blocks also comprise an affinity region providing for affinity towards the nascent complex.
  • Thus, the chemical entities are suitably mediated to the nascent encoded molecule by a building block, which further comprises an anticodon. The anti-codon serves the function of transferring the genetic information of the building block in conjunction with the transfer of a chemical entity. The transfer of genetic information and chemical entity may occur in any order. The chemical entities are preferably reacted without enzymatic interaction in some aspects of the invention. Notably, the reaction of the chemical entities is preferably not mediated by ribosomes or enzymes having similar activity. In other aspects of the invention, enzymes are used to mediate the reaction between a chemical entity and a nascent encoded molecule.
  • According to certain aspects of the invention the genetic information of the anti-codon is transferred by specific hybridisation to a codon on a nucleic acid identifier. Another method for transferring the genetic information of the anti-codon to the nascent complex is to anneal an oligonucleotide complementary to the anti-codon and attach this oligonucleotide to the complex, e.g. by ligation. A still further method involves transferring the genetic information of the anti-codon to the nascent complex by an extension reaction using a polymerase and a mixture of dNTPs.
  • The chemical entity of the building block may in most cases be regarded as a precursor for the structural entity eventually incorporated into the encoded molecule. In other cases the chemical entity provides for the elimination of chemical units of the nascent encoded molecule. Therefore, when it is stated in the present application and claims that a chemical entity is transferred to a nascent encoded molecule, it is to be understood that not necessarily all the atoms of the original chemical entity are to be found in the eventually formed encoded molecule. Also, as a consequence of the reactions involved in the connection, the structure of the chemical entity can be changed when it appears on the nascent encoded molecule. Especially, the cleavage resulting in the release of the entity may generate a reactive group which in a subsequent step can participate in the formation of a connection between a nascent complex and a chemical entity.
  • The chemical entity of the building block comprises at least one reactive group capable of participating in a reaction which results in a connection between the chemical entity of the building block and another chemical entity or a scaffold associated with the nascent complex. The number of reactive groups which appear on the chemical entity is suitably one to ten. A building block featuring only one reactive group is used i.a. in the end positions of polymers or scaffolds, whereas building blocks having two reactive groups are suitable for the formation of the body part of a polymer or scaffolds capable of being reacted further. One, two or more reactive groups intended for the formation of connections, are typically present on scaffolds. Non-limiting examples of scaffolds are opiates, steroids, benzodiazepines, hydantoines, and peptidylphosphonates.
  • The reactive group of the chemical entity may be capable of forming a direct connection to a reactive group of the nascent complex or the reactive group of the building block may be capable of forming a connection to a reactive group of the nascent complex through a bridging fill-in group. It is to be understood that not all the atoms of a reactive group are necessarily maintained in the connection formed. Rather, the reactive groups are to be regarded as precursors for the structure of the connection.
  • The subsequent cleavage step to release the chemical entity from the building block can be performed in any appropriate way. In an aspect of the invention the cleavage involves usage of a chemical reagent or an enzyme. The cleavage results in a transfer of the chemical entity to the nascent encoded molecule or in a transfer of the nascent encoded molecule to the chemical entity of the building block. In some cases it may be advantageous to introduce new chemical groups as a consequence of linker cleavage. The new chemical groups may be used for further reaction in a subsequent cycle, either directly or after having been activated. In other cases it is desirable that no trace of the linker remains after the cleavage.
  • In another aspect, the connection and the cleavage are conducted as a simultaneous reaction, i.e. either the chemical entity of the building block or the nascent encoded molecule is a leaving group of the reaction. In some aspects of the invention, it is appropriate to design the system such that the connection and the cleavage occur simultaneously because this will reduce the number of steps and the complexity. The simultaneous connection and cleavage can also be designed such that either no trace of the linker remains or such that a new chemical group for further reaction is introduced, as described above.
  • The attachment of the chemical entity to the building block, optionally via a suitable spacer, can be at any entity available for attachment, e.g. the chemical entity can be attached to a nucleobase or the backbone. In general, it is preferred to attach the chemical entity at the phosphorus of the internucleoside linkage or at the nucleobase. When the nucleobase is used for attachment of the chemical entity, the attachment point is usually at the 7 position of the purines or 7-deaza-purines or at the 5 position of pyrimidines. The nucleotide may be distanced from the reactive group of the chemical entity by a spacer moiety. The spacer may be designed such that the conformational space sampled by the reactive group is optimized for a reaction with the reactive group of the nascent encoded molecule.
  • The encoded molecules may have any chemical structure. In a preferred aspect, the encoded molecule can be any compound that may be synthesized in a component-by-component fashion. In some aspects the synthetic molecule is a linear or branched polymer. In another aspect the synthetic molecule is a scaffolded molecule. The term “encoded molecule” also comprises naturally occurring molecules like α-polypeptides etc, however produced in vitro usually in the absence of enzymes, like ribosomes. In certain aspects, the synthetic molecule of the library is a non-α-polypeptide.
  • The encoded molecule may have any molecular weight. However, in order to be orally available, it is preferred that the synthetic molecule has a molecular weight of less than 2000 Daltons, preferably less than 1000 Daltons, and more preferably less than 500 Daltons.
  • The size of the library may vary considerably depending on the expected result of the inventive method. In some aspects, it may be sufficient that the library comprises two, three, or four different complexes. However, in most events, more than two different complexes are desired to obtain a higher diversity. In some aspects, the library comprises 1,000 or more different complexes, more preferred 1,000,000 or more different complexes. The upper limit for the size of the library is only restricted by the size of the vessel in which the library is comprised. It may be calculated that a vial may comprise up to 10^14 different complexes.
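  • The library sizes mentioned above can be put into perspective with a short calculation. The following Python sketch assumes a fully combinatorial library in which every combination of chemical entities is present, so that the number of different complexes is the product of the number of entities available in each position. The function names and the example numbers are hypothetical; only the Avogadro constant is a fixed value.

```python
from math import prod

AVOGADRO = 6.02214076e23  # molecules per mole


def library_size(entities_per_position):
    """Number of different complexes if every combination of entities is formed."""
    return prod(entities_per_position)


def picomoles_for_one_copy_each(n_complexes):
    """Amount of material, in picomoles, holding a single copy of each complex."""
    return n_complexes / AVOGADRO * 1e12


# Hypothetical designs: three positions with 100 entities each,
# and four positions with 1000 entities each.
small = library_size([100, 100, 100])
large = library_size([1000] * 4)
print(small, large)
print(round(picomoles_for_one_copy_each(10**14), 1))
```

  • Under these assumptions 10^14 different complexes at one copy each correspond to roughly 166 picomoles of material, which illustrates why the vessel rather than the chemistry sets the practical upper limit.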
  • Methods for Forming Libraries of Complexes
  • The encoded molecules associated with an identifier oligonucleotide having two or more codons that code for reactants that have reacted in the formation of the molecule part of the complex may be formed by a variety of processes. Generally, the preferred methods can be used for the formation of virtually any kind of encoded molecule. Suitable examples of processes include prior art methods disclosed in WO 93/20242, WO 93/06121, WO 00/23458, WO 02/074929, and WO 02/103008, the content of which is incorporated herein by reference, as well as methods of the present applicant not yet publicly available, including the methods disclosed in PCT/DK03/00739 filed 30 Oct. 2003, and DK PA 2003 00430 filed 20 Mar. 2003. Any of these methods may be used, and the entire content of the patent applications is included herein by reference.
  • Below five presently preferred embodiments are described. A first embodiment disclosed in more detail in WO 02/103008 is based on the use of a polymerase to incorporate unnatural nucleotides as building blocks. Initially, a plurality of identifier oligonucleotides is provided. Subsequently primers are annealed to each of the identifiers and a polymerase extends the primer using nucleotide derivatives, which have appended chemical entities. Subsequent to or simultaneously with the incorporation of the nucleotide derivatives, the chemical entities are reacted to form a reaction product. The encoded molecule may be post-modified by cleaving some of the linking moieties to better present the encoded molecule.
  • Several possible reaction approaches for the chemical entities are apparent. First, the nucleotide derivatives can be incorporated and the chemical entities subsequently polymerised. In the event the chemical entities each carry two reactive groups, the chemical entities can be attached to adjacent chemical entities by a reaction of these reactive groups. Exemplary of the reactive groups are amine and carboxylic acid, which upon reaction form an amide bond. Adjacent chemical entities can also be linked together using a linking or bridging moiety. Exemplary of this approach is the linking of two chemical entities each bearing an amine group by a dicarboxylic acid. Yet another approach is the use of a reactive group between a chemical entity and the nucleotide building block, such as an ester or a thioester group. An adjacent building block having a reactive group such as an amine may cleave the interspaced reactive group to obtain a linkage to the chemical entity, e.g. by an amide linking group.
  • A second embodiment for obtainment of complexes disclosed in WO 02/103008 pertains to the use of hybridisation of building blocks to an identifier and reaction of chemical entities attached to the building blocks in order to obtain a reaction product. This approach comprises that identifiers are contacted with a plurality of building blocks, wherein each building block comprises an anti-codon and a chemical entity. The anti-codons are designed such that they recognise a sequence, i.e. a codon, on the identifier. Subsequent to the annealing of the anti-codon and the codon to each other a reaction of the chemical entity is effected.
  • The identifier may be associated with a scaffold. Building blocks bringing chemical entities in may be added sequentially or simultaneously and a reaction of the reactive group of the chemical entity may be effected at any time after the annealing of the building blocks to the identifier.
  • A third embodiment for the generation of a complex includes chemical or enzymatic ligation of building blocks when these are lined up on an identifier. Initially, identifiers are provided, each having one or more codons. The identifiers are contacted with building blocks comprising anti-codons linked to chemical entities. The two or more anti-codons annealed on an identifier are subsequently ligated to each other and a reaction of the chemical entities is effected to obtain a reaction product. The method is disclosed in more detail in DK PA 2003 00430 filed 20 Mar. 2003.
  • A fourth embodiment makes use of the extension by a polymerase of an affinity sequence of the nascent complex to transfer the anti-codon of a building block to the nascent complex. The method implies that a nascent complex comprising a scaffold and an affinity region is annealed to a building block comprising a region complementary to the affinity section. Subsequently, the anti-codon region of the building block is transferred to the nascent complex by a polymerase. The chemical entity may be transferred prior to, simultaneously with or subsequent to the transfer of the anti-codon. This method is disclosed in detail in PCT/DK03/00739.
  • A fifth embodiment also disclosed in PCT/DK03/00739 comprises reaction of a reactant with a reaction site on a nascent bifunctional molecule and addition of a nucleic acid tag to the nascent bifunctional molecule using an enzyme, such as a ligase. When a library is formed, usually an array of compartments is used for reaction of reactants and enzymatic addition of tags with the nascent bifunctional molecule.
  • Thus, the codons are either pre-made into one or more identifiers before the encoded molecules are generated or the codons are transferred simultaneously with the formation of the encoded molecules.
  • After or simultaneously with the formation of the reaction product some of the linkers to the identifier may be cleaved, however, usually at least one linker is maintained to provide for the complex.
  • Nucleotides
  • The nucleotides used in the present invention may be linked together in a sequence of nucleotides, i.e. an oligonucleotide. Each nucleotide monomer is normally composed of two parts, namely a nucleobase moiety, and a backbone. The backbone may in some cases be subdivided into a sugar moiety and an internucleoside linker.
  • The nucleobase moiety may be selected among naturally occurring nucleobases as well as non-naturally occurring nucleobases. Thus, “nucleobase” includes not only the known purine and pyrimidine heterocycles, but also heterocyclic analogues and tautomers thereof. Illustrative examples of nucleobases are adenine, guanine, thymine, cytosine, uracil, purine, xanthine, diaminopurine, 8-oxo-N6-methyladenine, 7-deazaxanthine, 7-deazaguanine, N4,N4-ethanocytosine, N6,N6-ethano-2,6-diamino-purine, 5-methylcytosine, 5-(C3-C6)-alkynylcytosine, 5-fluorouracil, 5-bromouracil, pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine, inosine and the “non-naturally occurring” nucleobases described in Benner et al., U.S. Pat. No. 5,432,272. The term “nucleobase” is intended to cover these examples as well as analogues and tautomers thereof. Especially interesting nucleobases are adenine, guanine, thymine, cytosine, 5-methylcytosine, and uracil, which are considered as the naturally occurring nucleobases in relation to therapeutic and diagnostic application in humans.
  • Examples of suitable specific pairs of nucleobases are shown below:
    Figure US20070026397A1-20070201-C00001
    Figure US20070026397A1-20070201-C00002
  • Suitable examples of backbone units are shown below (B denotes a nucleobase):
    Figure US20070026397A1-20070201-C00003
    Figure US20070026397A1-20070201-C00004
  • The sugar moiety of the backbone is suitably a pentose but may be the appropriate part of a PNA or a six-member ring. Suitable examples of possible pentoses include ribose, 2′-deoxyribose, 2′-O-methyl-ribose, 2′-fluoro-ribose, and 2′,4′-O-methylene-ribose (LNA). Suitably the nucleobase is attached to the 1′ position of the pentose entity.
  • An internucleoside linker connects the 3′ end of a preceding monomer to the 5′ end of a succeeding monomer when the sugar moiety of the backbone is a pentose, like ribose or 2′-deoxyribose. The internucleoside linkage may be the naturally occurring phosphodiester linkage or a derivative thereof. Examples of such derivatives include phosphorothioate, methylphosphonate, phosphoramidate, phosphotriester, and phosphodithioate. Furthermore, the internucleoside linker can be any of a number of non-phosphorous-containing linkers known in the art.
  • Preferred nucleic acid monomers include naturally occurring nucleosides forming part of the DNA as well as the RNA family connected through phosphodiester linkages. The members of the DNA family include deoxyadenosine, deoxyguanosine, deoxythymidine, and deoxycytidine. The members of the RNA family include adenosine, guanosine, uridine, cytidine, and inosine. Inosine is a non-specific pairing nucleoside and may be used as universal base because inosine can pair nearly isoenergetically with A, T, and C. Other compounds having the same ability of non-specifically base-pairing with natural nucleobases have been formed. Suitable compounds which may be utilized in the present invention include, among others, the compounds depicted below:
  • EXAMPLES OF UNIVERSAL BASES
  • Figure US20070026397A1-20070201-C00005
  • Building Block
  • The chemical entities or reactants that are precursors for structural additions or eliminations of the encoded molecule may be attached to a building block prior to the participation in the formation of the reaction product leading to the final encoded molecule. Besides the chemical entity, the building block generally comprises an anti-codon.
  • The chemical entity of the building block comprises at least one reactive group capable of participating in a reaction, which results in a connection between the chemical entity of the building block and another chemical entity or a scaffold associated with the nascent complex. The connection is facilitated by one or more reactive groups of the chemical entity. The number of reactive groups, which appear on the chemical entity, is suitably one to ten. A building block featuring only one reactive group is used i.a. in the end positions of polymers or scaffolds, whereas building blocks having two reactive groups are suitable for the formation of the body part of a polymer or scaffolds capable of being reacted further. One, two or more reactive groups intended for the formation of connections are typically present on scaffolds.
  • The reactive group of the building block may be capable of forming a direct connection to a reactive group of the nascent complex or the reactive group of the building block may be capable of forming a connection to a reactive group of the nascent complex through a bridging fill-in group. It is to be understood that not all the atoms of a reactive group are necessarily maintained in the connection formed. Rather, the reactive groups are to be regarded as precursors for the structure of the connection.
  • The subsequent cleavage step to release the chemical entity from the building block can be performed in any appropriate way. In an aspect of the invention the cleavage involves usage of a reagent or an enzyme. The cleavage results in a transfer of the chemical entity to the nascent encoded molecule or in a transfer of the nascent encoded molecule to the chemical entity of the building block. In some cases it may be advantageous to introduce new chemical groups as a consequence of linker cleavage. The new chemical groups may be used for further reaction in a subsequent cycle, either directly or after having been activated. In other cases it is desirable that no trace of the linker remains after the cleavage.
  • In another aspect, the connection and the cleavage are conducted as a simultaneous reaction, i.e. either the chemical entity of the building block or the nascent encoded molecule is a leaving group of the reaction. In general, it is preferred to design the system such that the connection and the cleavage occur simultaneously because this will reduce the number of steps and the complexity. The simultaneous connection and cleavage can also be designed such that either no trace of the linker remains or such that a new chemical group for further reaction is introduced, as described above.
  • The attachment of the chemical entity to the building block, optionally via a suitable spacer, can be at any entity available for attachment, e.g. the chemical entity can be attached to a nucleobase or the backbone. In general, it is preferred to attach the chemical entity at the phosphorus of the internucleoside linkage or at the nucleobase. When the nucleobase is used for attachment of the chemical entity, the attachment point is usually at the 7 position of the purines or 7-deaza-purines or at the 5 position of pyrimidines. The nucleotide may be distanced from the reactive group of the chemical entity by a spacer moiety. The spacer may be designed such that the conformational space sampled by the reactive group is optimized for a reaction with the reactive group of the nascent encoded molecule or reactive site.
  • The anticodon complements the codon of the identifier oligonucleotide sequence and generally comprises the same number of nucleotides as the codon. The anti-codon may be adjoined with a fixed sequence, such as a sequence complementing a framing sequence.
  • Various specific building blocks are envisaged. Building blocks of particular interest are shown below.
  • Building Blocks Transferring a Chemical Entity to a Recipient Nucleophilic Group
  • The building block indicated below is capable of transferring a chemical entity (CE) to a recipient nucleophilic group, typically an amine group. The bold lower horizontal line illustrates the building block comprising an anti-codon and the vertical line illustrates a spacer.
    Figure US20070026397A1-20070201-C00006
  • The 5-membered substituted N-hydroxysuccinimide (NHS) ring serves as an activator, i.e. a labile bond is formed between the oxygen atom connected to the NHS ring and the chemical entity. The labile bond may be cleaved by a nucleophilic group, e.g. positioned on a scaffold, to transfer the chemical entity to the scaffold, thus converting the remainder of the fragment into a leaving group of the reaction. When the chemical entity is connected to the activator through a carbonyl group and the recipient group is an amine, the bond formed on the scaffold will be an amide bond. The above building block is the subject of WO03078627A2, the content of which is incorporated herein in its entirety by reference.
  • Another building block, which may form an amide bond, is
    Figure US20070026397A1-20070201-C00007
  • R may be absent or NO2, CF3, halogen, preferably Cl, Br, or I, and Z may be S or O. This type of building block is disclosed in WO03078626A2. The content of this patent application is incorporated herein in the entirety by reference.
  • A nucleophilic group can cleave the linkage between Z and the carbonyl group thereby transferring the chemical entity —(C═O)—CE′ to said nucleophilic group.
  • Building Blocks Transferring a Chemical Entity to a Recipient Reactive Group Forming a C═C Bond
  • A building block as shown below is able to transfer the chemical entity to a recipient aldehyde group, thereby forming a double bond between the carbon of the aldehyde and the chemical entity.
    Figure US20070026397A1-20070201-C00008
  • The above building block is disclosed in WO03078445A2, the content of which is incorporated herein in its entirety by reference.
  • Building Blocks Transferring a Chemical Entity to a Recipient Reactive Group Forming a C—C Bond
  • The below building block is able to transfer the chemical entity to a recipient group thereby forming a single bond between the receiving moiety, e.g. a scaffold, and the chemical entity.
    Figure US20070026397A1-20070201-C00009
  • The above building block is disclosed in WO03078445A2, the content of which is incorporated herein in its entirety by reference.
  • Another building block capable of transferring a chemical entity to a receiving reactive group forming a single bond is
    Figure US20070026397A1-20070201-C00010
  • The receiving group may be a nucleophile, such as a group comprising a hetero atom, thereby forming a single bond between the chemical entity and the hetero atom, or the receiving group may be an electronegative carbon atom, thereby forming a C—C bond between the chemical entity and the scaffold. The above building block is disclosed in WO03078446A2, the content of which is incorporated herein by reference.
  • The chemical entity attached to any of the above building blocks may be selected from a large arsenal of chemical structures. Examples of chemical entities are H or entities selected among the group consisting of a C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C4-C8 alkadienyl, C3-C7 cycloalkyl, C3-C7 cycloheteroalkyl, aryl, and heteroaryl, said group being substituted with 0-3 R4, 0-3 R5 and 0-3 R9 or C1-C3 alkylene-NR4 2, C1-C3 alkylene-NR4C(O)R8, C1-C3 alkylene-NR4C(O)OR8, C1-C2 alkylene-O—NR4 2, C1-C2 alkylene-O—NR4C(O)R8, C1-C2 alkylene-O—NR4C(O)OR8 substituted with 0-3 R9.
      • where R4 is H or selected independently among the group consisting of C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 cycloalkyl, C3-C7 cycloheteroalkyl, aryl, heteroaryl, said group being substituted with 0-3 R9 and
      • R5 is selected independently from —N3, —CNO, —C(NOH)NH2, —NHOH, —NHNHR6, —C(O)R6, —SnR6 3, —B(OR6)2, —P(O)(OR6)2 or the group consisting of C2-C6 alkenyl, C2-C6 alkynyl, C4-C8 alkadienyl said group being substituted with 0-2 R7,
      • where R6 is selected independently from H, C1-C6 alkyl, C3-C7 cycloalkyl, aryl or C1-C6 alkylene-aryl substituted with 0-5 halogen atoms selected from —F, —Cl, —Br, and —I; and
      • R7 is independently selected from —NO2, —COOR6, —COR6, —CN, —OSiR6 3, —OR6 and —NR6 2.
      • R8 is H, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 cycloalkyl, aryl or C1-C6 alkylene-aryl substituted with 0-3 substituents independently selected from —F, —Cl, —NO2, —R3, —OR3, —SiR3 3
      • R9 is ═O, —F, —Cl, —Br, —I, —CN, —NO2, —OR6, —NR6 2, —NR6—C(O)R8, —NR6—C(O)OR8, —SR6, —S(O)R6, —S(O)2R6, —COOR6, —C(O)NR6 2 and —S(O)2NR6 2.
  • Cross-Link Cleavage Building Blocks
  • It may be advantageous to split the transfer of a chemical entity to a recipient reactive group into two separate steps, namely a cross-linking step and a cleavage step because each step can be optimized. A suitable building block for this two-step process is illustrated below:
    Figure US20070026397A1-20070201-C00011
  • Initially, a reactive group appearing on the chemical entity precursor (abbreviated FEP) reacts with a recipient reactive group, e.g. a reactive group appearing on a scaffold, thereby forming a cross-link. Subsequently, a cleavage is performed, usually by adding an aqueous oxidising agent such as I2, Br2, Cl2, H+, or a Lewis acid. The cleavage results in a transfer of the group HZ-FEP- to the recipient moiety, such as a scaffold.
  • In the above formula
      • Z is O, S, NR4
      • Q is N, CR1
      • P is a valence bond, O, S, NR4, or a group C5-7arylene, C1-6alkylene, C1-6O-alkylene, C1-6S-alkylene, NR1-alkylene, C1-6alkylene-O, C1-6alkylene-S, said group optionally being substituted with 0-3 R4, 0-3 R5 and 0-3 R9 or C1-C3 alkylene-NR4 2, C1-C3 alkylene-NR4C(O)R8, C1-C3 alkylene-NR4C(O)OR8, C1-C2 alkylene-O—NR4 2, C1-C2 alkylene-O—NR4C(O)R8, C1-C2 alkylene-O—NR4C(O)OR8 substituted with 0-3 R9,
      • B is a group comprising D-E-F, in which
      • D is a valence bond or a group C1-6alkylene, C1-6alkenylene, C1-6alkynylene, C5-7arylene, or C5-7heteroarylene, said group optionally being substituted with 1 to 4 group R11,
      • E is, when present, a valence bond, O, S, NR4, or a group C1-6alkylene, C1-6alkenylene, C1-6alkynylene, C5-7arylene, or C5-7heteroarylene, said group optionally being substituted with 1 to 4 group R11,
      • F is, when present, a valence bond, O, S, or NR4,
      • A is a spacing group distancing the chemical structure from the complementing element, which may be a nucleic acid,
      • R1, R2, and R3 are independent of each other selected among the group consisting of H, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C4-C8 alkadienyl, C3-C7 cycloalkyl, C3-C7 cycloheteroalkyl, aryl, and heteroaryl, said group being substituted with 0-3 R4, 0-3 R5 and 0-3 R9 or C1-C3 alkylene-NR4 2, C1-C3 alkylene-NR4C(O)R8, C1-C3 alkylene-NR4C(O)OR8, C1-C2 alkylene-O—NR4 2, C1-C2 alkylene-O—NR4C(O)R8, C1-C2 alkylene-O—NR4C(O)OR8 substituted with 0-3 R9,
      • FEP is a group selected among the group consisting of H, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C4-C8 alkadienyl, C3-C7 cycloalkyl, C3-C7 cycloheteroalkyl, aryl, and heteroaryl, said group being substituted with 0-3 R4, 0-3 R5 and 0-3 R9 or C1-C3 alkylene-NR4 2, C1-C3 alkylene-NR4C(O)R8, C1-C3 alkylene-NR4C(O)OR8, C1-C2 alkylene-O—NR4 2, C1-C2 alkylene-O—NR4C(O)R8, C1-C2 alkylene-O—NR4C(O)OR8 substituted with 0-3 R9,
      • where R4 is H or selected independently among the group consisting of C2-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 cycloalkyl, C3-C7 cycloheteroalkyl, aryl, heteroaryl, said group being substituted with 0-3 R9 and
      • R5 is selected independently from —N3, —CNO, —C(NOH)NH2, —NHOH, —NHNHR6, —C(O)R6, —SnR6 3, —B(OR6)2, —P(O)(OR6)2 or the group consisting of C2-C6 alkenyl, C2-C6 alkynyl, C4-C8 alkadienyl said group being substituted with 0-2 R7,
      • where R6 is selected independently from H, C1-C6 alkyl, C3-C7 cycloalkyl, aryl or C1-C6 alkylene-aryl substituted with 0-5 halogen atoms selected from —F, —Cl, —Br, and —I; and
      • R7 is independently selected from —NO2, —COOR6, —COR6, —CN, —OSiR6 3, —OR6 and —NR6 2.
      • R8 is H, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 cycloalkyl, aryl or C1-C6 alkylene-aryl substituted with 0-3 substituents independently selected from —F, —Cl, —NO2, —R3, —OR3, —SiR3 3
      • R9 is ═O, —F, —Cl, —Br, —I, —CN, —NO2, —OR6, —NR6 2, —NR6—C(O)R8, —NR6—C(O)OR8, —SR6, —S(O)R6, —S(O)2R6, —COOR6, —C(O)NR6 2 and —S(O)2NR6 2.
  • In a preferred embodiment Z is O or S, P is a valence bond, Q is CH, B is CH2, and R1, R2, and R3 are H. The bond between the carbonyl group and Z is cleavable with aqueous I2.
  • Partitioning Conditions
  • The partition step may be referred to as a selection or a screen, as appropriate, and includes the screening of the library for encoded molecules having predetermined desirable characteristics. Predetermined desirable characteristics can include binding to a target, catalytically changing the target, chemically reacting with a target in a manner which alters/modifies the target or the functional activity of the target, and covalently attaching to the target as in a suicide inhibitor.
  • The target can be any compound of interest. E.g. the target can be a protein, peptide, carbohydrate, polysaccharide, glycoprotein, hormone, receptor, antigen, antibody, virus, substrate, metabolite, transition state analogue, cofactor, inhibitor, drug, dye, nutrient, growth factor, cell, tissue, etc. without limitation. Particularly preferred targets include, but are not limited to, angiotensin converting enzyme, renin, cyclooxygenase, 5-lipoxygenase, IIL-10 converting enzyme, cytokine receptors, PDGF receptor, type II inosine monophosphate dehydrogenase, β-lactamases, integrin, and fungal cytochrome P-450. Targets can include, but are not limited to, bradykinin, neutrophil elastase, the HIV proteins, including tat, rev, gag, int, RT, nucleocapsid etc., VEGF, bFGF, TGFβ, KGF, PDGF, thrombin, theophylline, caffeine, substance P, IgE, sPLA2, red blood cells, glioblastomas, fibrin clots, PBMCs, hCG, lectins, selectins, cytokines, ICP4, complement proteins, etc.
  • Encoded molecules having predetermined desirable characteristics can be partitioned away from the rest of the library while still attached to the identifier nucleic acid sequence by various methods known to one of ordinary skill in the art. In one embodiment of the invention the desirable products are partitioned away from the entire library without chemical degradation of the attached nucleic acid identifier such that the identifiers are amplifiable. The identifiers may then be amplified, either still attached to the desirable encoded molecule or after separation from the desirable encoded molecule.
  • In a preferred embodiment, the desirable encoded molecule acts on the target without any interaction between the nucleic acid attached to the desirable encoded molecule and the target. In one embodiment, the bound complex-target aggregate can be partitioned from unbound complexes by a number of methods. The methods include nitrocellulose filter binding, column chromatography, filtration, affinity chromatography, centrifugation, and other well known methods.
  • Briefly, the library of complexes is subjected to the partitioning step, which may include contact between the library and a column onto which the target is immobilised. Identifier nucleic acids associated with undesirable encoded molecules, i.e. encoded molecules not bound to the target under the stringency conditions used, will pass through the column. Additional undesirable encoded molecules (e.g. encoded molecules which cross-react with other targets) may be removed by counter-selection methods. Desirable complexes are bound to the column and can be eluted by changing the conditions of the column (e.g., salt, pH, surfactant, etc.) or the identifier.
  • Additionally, encoded molecules which react with a target can be separated from those products that do not react with the target. In one example, a chemical compound which covalently attaches to the target (such as a suicide inhibitor) can be washed under very stringent conditions. The resulting complex can then be treated with proteinase, DNAse or other suitable reagents to cleave a linker and liberate the nucleic acids which are associated with the desirable chemical compound. The liberated nucleic acids can be amplified.
  • In another example, the predetermined characteristic of the desirable product is the ability of the product to transfer a chemical group (such as acyl transfer) to the target and thereby inactivate the target. One could have a product library where all of the products have a thioester chemical group. Upon contact with the target, the desirable products will transfer the chemical group to the target concomitantly changing the desirable product from a thioester to a thiol. Therefore, a partitioning method which would identify products that are now thiols (rather than thioesters) will enable the selection of the desirable products and amplification of the nucleic acid associated therewith.
  • There are other partitioning and screening processes compatible with this invention that are known to one of ordinary skill in the art. In one embodiment, the products can be fractionated by a number of common methods and each fraction is then assayed for activity. The fractionation can be based on size, pH, hydrophobicity, etc.
  • Inherent in the present method is the selection of encoded molecules on the basis of a desired function; this can be extended to the selection of molecules with a desired function and specificity. Specificity can be required during the selection process by first extracting identifier nucleic acid sequences of chemical compounds which are capable of interacting with a non-desired “target” (negative selection, or counter-selection), followed by positive selection with the desired target. As an example, inhibitors of fungal cytochrome P450 are known to cross-react to some extent with mammalian cytochrome P450 (resulting in serious side effects). Highly specific inhibitors of the fungal cytochrome could be selected from a library by first removing those products capable of interacting with the mammalian cytochrome, followed by retention of the remaining products which are capable of interacting with the fungal cytochrome.
  • BRIEF DESCRIPTION OF THE FIGURES
  • FIG. 1 illustrates the overall process of building block evolution.
  • FIG. 2 shows the distribution of codons in different positions in an output from a selection.
  • FIG. 3 shows the difference between identifier driven and building block driven evolution.
  • FIG. 4 shows a method for reducing the library diversity through codon analysis.
  • FIG. 5 discloses two embodiments of using a Taqman probe (5′ nuclease probe) in the measurement of the presence or absence of a certain codon.
  • FIG. 6 shows a standard curve referred to in example 4.
  • FIG. 7 shows a result of example 4.
  • FIG. 8 discloses a result of example 4.
  • FIG. 9 discloses a scheme relating to combined structural information and codon abundances in library design.
  • FIG. 10 discloses a relationship between codon analysis and structural information.
  • FIG. 11 shows the detection of single codons of identifiers.
  • FIG. 12 shows the detection of codon pairs of identifiers.
  • FIG. 13 shows the detection of codon pairs at specific codon positions.
  • FIG. 14 shows the detection of single codons of identifiers after the separation of the individual codons.
  • FIG. 15 discloses a method for selecting from a library, complexes capable of binding to a target molecule.
  • FIG. 16 discloses a method for enriching specific nucleic acid fragments and the utility of these fragments for the generation of a new library.
  • FIG. 17 discloses a method for reducing the diversity of a library of complexes.
  • DETAILED DESCRIPTION OF THE FIGURES
  • FIG. 1A shows the principal steps in BB evolution. An initial library of desired size is produced. This initial library is subjected to a selection process where encoded molecules that associate with a target of interest are enriched. The encoding identifier oligonucleotide is preferably amplified and then used in the codon analysis step. This step monitors the relative abundance of each codon in the selected library. The information obtained in this analysis is used to design a new enriched library, which contains the preferred chemical entities and their corresponding codons. This new library is then subjected to a new selection process to select for binders. This diversity reduction cycle can be repeated until the desired result is obtained and the binders have been identified.
  • FIG. 1B shows how the diversity of a library (n^4) is reduced by reducing the number of chemical entities (n) in the library. Thus, by removing chemical entities not involved in the encoded molecules partitioned, a reduction in library diversity can be obtained to allow the identification of binders.
  • The identifier oligonucleotide that encodes the display molecule is composed of codons and is associated with the encoded molecule, as shown in FIG. 2. These codons carry information about the chemical entities in the encoded molecule. Each codon position can be analysed for its precise sequence, which reflects which chemical entities have been enriched in the selection process. The relative amount can also be obtained by comparing the signals in the measuring procedure (e.g. QPCR and array analysis). Each codon position thus has its own fingerprint showing which chemical entities the selected display molecules possess. These fingerprints can subsequently be used to put together a new, more focused library with a lower and more enriched diversity, which can be subjected to another round of selection. This can then be repeated until the preferred encoded molecules have been obtained.
  • FIG. 3 illustrates the main difference between identifier and chemical entity (CE) evolution. In both cases the initial selection starts on a library with a certain diversity. In identifier-driven evolution, the encoding identifiers are amplified after the first round of selection such that their distribution is maintained. This distribution is then transferred to the next generation, which is used in a new selection. Thus, the strongest binders that were enriched in the first round of selection will be present at a relatively higher concentration compared to the weaker binders and the background. In CE-driven evolution the codon analysis is used to design a new library. In this example, the new library is constructed to contain all the chemical entities that were identified as a positive signal in the analysis. In other words, all the chemical entities that were not detected through the codon analysis are excluded from the new library. The new library is designed to have an equal amount of each selected chemical entity, which will generate all the possible display molecules at the same concentration. This will allow all binders to compete at the same concentration and potentially retain a more diverse set of binders in each round of selection. This is especially important for small molecules, where not only the affinity is of interest.
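  • The CE-driven design step described for FIG. 3 can be sketched as follows. This is only an illustrative sketch: the function names, the signal values, and the detection threshold are hypothetical and are not taken from the specification. It keeps every codon whose signal counts as positive and gives each resulting display molecule the same share of the new library.

```python
from math import prod

def select_entities(signal_by_position, threshold):
    """Keep, per codon position, the codons whose signal exceeds the threshold."""
    return [sorted(codon for codon, signal in position.items() if signal > threshold)
            for position in signal_by_position]

def equimolar_library(kept_by_position):
    """Return (number of display molecules, equal share of each molecule)."""
    size = prod(len(kept) for kept in kept_by_position)
    share = 1.0 / size if size else 0.0
    return size, share

# Hypothetical codon-analysis signals for a two-position library.
signals = [
    {"A1": 0.40, "A2": 0.35, "A3": 0.01},
    {"B1": 0.50, "B2": 0.02, "B3": 0.30},
]
kept = select_entities(signals, threshold=0.05)  # assumed cut-off for a "positive signal"
size, share = equimolar_library(kept)
print(kept, size, share)
```

  • In this sketch the measured abundances are used only to decide inclusion; once included, all entities are treated equally, which mirrors the equal-concentration design described above.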
  • FIG. 4. This illustrates the process where the diversity is reduced through the codon analysis. An initial library of 10^10 (e.g. 317*317*317*317) library members is subjected to a selection. The enriched identifier oligonucleotides are amplified and used in the codon analysis. The codon analysis result is used to design a new 10^7 (e.g. 57*57*57*57) library where the enriched chemical entities are included. This new library is then again subjected to a selection process. The identifier oligonucleotides are amplified and used for codon analysis. This new codon analysis result is again used to design a new 10^4 (e.g. 10*10*10*10) library where the enriched chemical entities are included. Finally, a last selection step is performed on this reduced diversity library to identify the binders.
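  • The library sizes quoted for FIG. 4 follow from raising the number of chemical entities per position to the number of positions. A minimal check of that arithmetic, assuming four positions with the same number of entities in each (the function name is illustrative only):

```python
def library_size(entities_per_position: int, positions: int = 4) -> int:
    """Number of distinct encoded molecules in a combinatorial library."""
    return entities_per_position ** positions

for n in (317, 57, 10):
    print(f"{n} entities x 4 positions -> {library_size(n):,} members")
```

  • The exact values are 10,098,039,121, 10,556,001 and 10,000, i.e. roughly 10^10, 10^7 and 10^4.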
  • A preferred embodiment of the invention utilizing a universal Taqman probe is shown in FIG. 5. Four codons are shown (P1 through P4; bold pattern) along with flanking regions (light pattern). A universal Taqman probe anneals to a region adjacent to the codon region, but within the amplicon defined by the universal PCR primers Pr.1 and Pr. 2. These primers could be the same as those used for amplification of the identifier oligonucleotides encoding binders after an enrichment process on a specific target. However, if minimal-length identifiers are preferred during the encoding process, the region involved in Taqman probe annealing could be appended to the library identifier oligonucleotides by e.g. overlap PCR, ligation, or by employing a long downstream PCR primer containing the necessary sequences. The added length corresponding to the region necessary for annealing of the Taqman probe would be from 20 to 40 nts depending on the type of TaqMan probe and TA of the PCR primers. The Q-PCR reactions are preferably performed in a 96- or 384-well format on a real-time PCR thermocycling machine.
  • Panel A shows the detection of abundance of a specific codon sequence in position one. Similar primers are prepared for all codon sequences. For each codon sequence utilized to encode a specific BB in the library a Q-PCR reaction is performed with a primer oligonucleotide complementary to the codon sequence in question. A downstream universal reverse primer Pr. 2 is provided after the Taqman probe to provide for an exponential amplification of the PCR amplicon. The setup is most suited for cases where the codon constitutes a length corresponding to a length suitable for a PCR primer.
  • Panel B shows the detection of abundance of a specific codon sequence in a specific codon position using a primer which complements a codon and a framing sequence. Similar primers are used for all the codons and framing sequences. For each codon sequence utilized to encode a specific BB at a specific codon position in the library, a Q-PCR reaction is performed with an oligo complementary to the codon sequence in question as well as a short region up- or downstream of the codon region, which ensures extension of the primer in a PCR reaction only when annealed to the codon sequence in that specific codon position. The number of specific primers and Q-PCR reactions needed to cover all codon sequences in all possible codon positions equals the number of codon sequences times the number of codon positions. Thus, monitoring the abundance of 96 different codon sequences in 4 different positions can be performed in a single run on four 96-well microtitre plates (as shown in Panel B) or a single 384-well plate on a suitable instrument. This architecture allows for the decoding of a library of 8.5×10^7 different encoded molecules.
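  • The plate layout arithmetic in Panel B can be checked with a short sketch. The function name and return layout are illustrative assumptions; only the numbers (96 codons, 4 positions, 96- and 384-well plates) come from the text.

```python
def qpcr_plan(n_codons: int, n_positions: int, wells_per_plate: int = 96):
    """Return (Q-PCR reactions needed, plates needed, library size decoded)."""
    reactions = n_codons * n_positions             # one primer per codon per position
    plates = -(-reactions // wells_per_plate)      # ceiling division
    decodable = n_codons ** n_positions            # all codon combinations
    return reactions, plates, decodable

print(qpcr_plan(96, 4))                       # four 96-well plates
print(qpcr_plan(96, 4, wells_per_plate=384))  # one 384-well plate
```

  • With 96 codons in 4 positions, 384 reactions cover a library of 84,934,656 (about 8.5×10^7) encoded molecules.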
  • Quantification is performed relative to the amount of full-length PCR product obtained in a parallel control reaction on the same input material performed with the two external PCR primers Pr.1+Pr. 2. Theoretically, a similar rate of accumulation of this control amplicon compared to the accumulation of a product utilizing a single codon+sequence specific primer would indicate a 100% dominance of this particular sequence in the position in question.
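  • One way to turn this comparison into a number is sketched below. The specification does not give a formula; the sketch assumes that each reaction yields a threshold cycle (Ct) and that both amplicons amplify with the same per-cycle efficiency (2.0 for perfect doubling). Under those assumptions a codon-specific reaction that tracks the control reaction corresponds to a fraction of 1.0, i.e. 100% dominance.

```python
def codon_fraction(ct_codon: float, ct_control: float, efficiency: float = 2.0) -> float:
    """Estimated fraction of identifiers carrying the codon at the probed position.

    Assumes (not stated in the source) equal amplification efficiency for the
    codon-specific and the control (Pr.1 + Pr.2) reactions.
    """
    return efficiency ** (ct_control - ct_codon)

print(codon_fraction(20.0, 20.0))  # same accumulation as the control
print(codon_fraction(22.0, 20.0))  # two cycles later than the control
```

  • A delay of two cycles relative to the control corresponds, under these assumptions, to the codon being present in about a quarter of the identifiers.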
  • Although the setups shown in Panel A and B employ a Taqman probe strategy, other detection systems (SYBR green, Molecular Beacons etc.) could be utilized. In theory, multiplex reactions employing up to 4 different fluorophores in the same reaction could increase throughput correspondingly.
  • An example of how a deconvolution process of a library of encoded molecules occurs is described in the following. Imagine that at the end of a selection scheme a pool of 3 ligand families (and the corresponding coding identifiers) are dominating the population and present at approx. the same concentration. Three different chemical entities are present in the first position of the encoded compounds, and each of these chemical entities is present in combination with one unique chemical entity out of 3 different chemical entities in position P2. Only one chemical entity in position 3 gives rise to active binders, whereas any of a 20% subset of chemical entities (e.g. determined by charge, size or other characteristics) is present in position 4. The outcome of the initial codon profile analysis would be: 3 codon sequences are equally dominating in position P1, 3 other codon sequences in position P2, 1 unique codon sequence is dominant in P3 whereas somewhat similarly increased levels of 20% of the codon sequences (background levels of the remaining 80% sequences) are seen in P4. In such cases it could be relevant to use an iterative Q-PCR (“IQ-PCR”) strategy to perform a further deconvolution of a library after selection. Again with reference to the example above, by taking the PCR products from the 3 individual wells that contained primers giving the high yields in position P1, diluting the product appropriately and performing a second round of Q-PCR on each of these identifier oligonucleotides separately, it would be possible to deduce which codon sequence(s) is preferred in P2 when a given codon sequence is present in P1.
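  • The logic of this example can be mimicked on a toy pool of identifiers. The codon names and the two-member P4 subset below are hypothetical; the sketch only illustrates why a per-position profile cannot pair P1 with P2, whereas a second analysis restricted to one P1 codon (as in the iterative Q-PCR step) can.

```python
from collections import Counter

def codon_profile(identifiers):
    """Per-position codon counts, as seen in the initial codon analysis."""
    n_positions = len(identifiers[0])
    return [Counter(ident[i] for ident in identifiers) for i in range(n_positions)]

def conditional_profile(identifiers, given_pos, given_codon, target_pos):
    """Codon counts at target_pos among identifiers carrying given_codon at given_pos."""
    return Counter(ident[target_pos] for ident in identifiers
                   if ident[given_pos] == given_codon)

# Hypothetical selected pool: 3 families (P1 paired with a unique P2),
# one codon in P3, and a small permitted subset in P4.
families = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]
p4_subset = ["D1", "D2"]
pool = [(a, b, "C1", d) for a, b in families for d in p4_subset]

profile = codon_profile(pool)
print(profile[0], profile[1])                   # three equally abundant codons each
print(conditional_profile(pool, 0, "A1", 1))    # only B1 accompanies A1
```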
  • FIG. 9. This figure illustrates the possibility of combining structural information about the chemical entities with their relative abundance when designing a new, more focused library. The structural information about the chemical entities can be used in at least two ways. First, the similarities between the chemical entities in each position can be used to choose chemical entities for a new library. Secondly, the combination of the selected chemical entities can be analyzed to investigate possible patterns that generate potential ligands. This is especially useful if the binding site or the structure of a known ligand is known. Any type of structural analysis tool can be used that generates information about the structure of separate chemical entities or combinations of chemical entities (the potential binders). By combining these three analysis approaches a more focused library can be generated that potentially will contain more specific binders compared to background binders. This new focused library can be used in another round of selection to reduce the diversity. This procedure can be repeated until the desired binders have been identified.
  • FIG. 10. This figure shows how the combination of codon analysis and structural information can generate valuable information. This invention allows the performance of structure activity relationship analysis (SAR) where the relative abundance in the codon analysis will represent the activity parameter (e.g. IC50 values) in the SAR measurements. Pharmacophore models can be generated, focused libraries can be designed, certain follow up chemistry can be used and information in the hit to lead process can be used.
  • FIG. 11 shows an array detection system in which a single codon is detected. Initially a library of selected complexes (29), i.e. complexes comprised of the initial library, which display a certain property, is provided as disclosed above. The initial library of complexes is prepared from e.g. 100 codons and identifiers having 4 codons in sequence, which theoretically gives a library of 10^8 complexes. The selected complexes are subjected to amplification to amplify the identifiers of the selected complexes and the amplification products are added to an array (30). The array (30) comprises probes (32) complementary to each of the codons of the identifiers (31). At hybridisation conditions the PCR products of the identifiers are annealed to the cognate probes of the array and in a suitable scanner the spatial positions of the annealed probes are detected to elucidate the codons (33) of the identifier. The quantity of each codon may be measured to find codons abundant in more than one identifier and/or codons leading to encoded molecules with high affinity. The information may be used for decoding of the encoded molecule of the complexes displaying the desired property or the information may be used for selection of building blocks, which are to be added in a next round of library formation.
  • FIG. 12 discloses an array detection system for establishing codon pairs, i.e. codons in the vicinity of each other. Initially (as shown in this example) a library of complexes is prepared from 100 different codons deposited on an identifier in a sequence of four, making the total number of possible combinations 10^8. The initial library is subjected to a condition in order to select a sub-library (29) displaying a desired property. The identifiers of the sub-library are amplified by a PCR reaction and the reaction product is added under hybridisation conditions to an array (34).
  • The array is designed with probes (35) capable of detecting two codons at a time. To cover all possible combinations of a library based on 100 different codons, 10^4 probes are needed, which is practically feasible with current technology. The detection of the codons may be conducted quantitatively, i.e. the relative abundance of each of the codon pairs may be determined. The detection on the array may be used to reconstruct the selected identifiers (36) as three overlapping codon pair detections depict the entire identifier. In the event the same codon pair appears on more than one identifier, the information on the relative abundance of each codon pair may be used to decipher the sequence of codons of the selected identifiers as it can be assumed that each codon pair of the same identifier appears in the same amounts in the PCR products added to the array.
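  • The probe and library counts used in the array examples follow directly from the number of codons. A small sketch of that arithmetic, with illustrative function names:

```python
def codon_pair_probes(n_codons: int) -> int:
    """Probes needed to cover every ordered pair of adjacent codons."""
    return n_codons ** 2

def n_combinations(n_codons: int, codons_per_identifier: int = 4) -> int:
    """Theoretical number of identifiers built from the codon set."""
    return n_codons ** codons_per_identifier

print(codon_pair_probes(100))  # probes for codon-pair detection
print(n_combinations(100))     # theoretical library size
```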
  • FIG. 13 discloses an array for detecting codon pairs at specific codon positions. Initially, a library of complexes comprising identifiers with framing sequences is provided. The framing sequence is specific for each position of the codons on the identifier. Four times more probes per codon are needed on the microarray if the position of the codons is also to be detected in the analysis, which is practically feasible with current technology. The position is detected due to the framing sequences next to each codon. The initial library is subjected to a selection process to isolate complexes (37) having a desired property. The selected complexes are amplified by a PCR reaction and the reaction products are added to an array (38). The array comprises probes capable of detecting codon pairs as well as the framing sequences (40) between the codons. The framing sequence determines the position of the codon in the reaction history, i.e. it is possible to deduce which chemical entity reacted at which point in time of the synthesis history of the encoded molecule, thus making it possible to reconstruct the structure of the encoded molecule.
  • The detection of the codon pairs may be conducted quantitatively, i.e. the relative abundance of each of the codon pairs may be determined. The detection on the array may be used to reconstruct the selected identifiers (41) as three overlapping codon pair detections depict the entire identifier. In the event the same codon pair appears on more than one identifier, the information on the relative abundance of each codon pair may be used to decipher the sequence of codons of the selected identifiers as it can be assumed that each codon pair of the same identifier appears in the same amounts in the PCR products added to the array.
  • FIG. 14 shows an array detection system in which a single codon is detected. Initially a library of selected complexes (42), i.e. complexes from the initial library that display a certain property, is provided as disclosed above. The initial library of complexes is prepared from e.g. 100 codons and identifiers having 4 codons in sequence, which theoretically gives a library of 10⁸ complexes. The selected complexes are subjected to amplification to amplify the identifiers of the selected complexes and the amplification products are treated with suitable reagents to cut between the individual codons (43). The individual codons are then applied to the array. The array (44) comprises probes (45) complementary to each of the codons of the identifiers (46). At hybridisation conditions the PCR products of the identifiers are annealed to the cognate probes of the array and in a suitable scanner the spatial position of the annealed probes is detected to elucidate the codons (47) of the identifier. The quantity of each codon may be measured to find codons abundant in more than one identifier and/or codons leading to encoded molecules with high affinity. The information may be used for decoding of the encoded molecule of the complexes displaying the desired property or the information may be used for selection of building blocks, which are to be added in a next round of library formation.
  • FIG. 15 discloses a method for selection of a suitable complex in several steps. In a first step the library of complexes 1 is provided. Each member of the library comprises an encoded molecule 2 composed of four chemical entities, which is attached to an identifier oligonucleotide 3 comprising four codons. The initial library shown comprises three complexes. In a second step the library of complexes is incubated with immobilized target molecules 4. Encoded molecules having an affinity towards the target molecule are bound to the immobilized target, whereas encoded molecules not having affinity towards the target under the conditions used remain in the liquid media. The complexes remaining in the liquid media are discarded by a washing process, while the bound complexes remain attached to the immobilized target molecules. The washing process is usually conducted using mild stringency conditions in the initial rounds of selection. In later stage selections the working stringency conditions are usually increased to allow only high affinity binders to remain attached to the target. Subsequent to the washing step the complexes having affinity towards the target molecule are recovered. The recovery process usually requires high stringency conditions to detach the encoded molecule from the immobilized target. The selected sub-library resulting from the elution is subjected to an amplification process. The amplification of the identifier nucleic acid sequence of the selected complexes is usually performed using the PCR method. Preferably, a modification of the PCR method is followed such that a biotin molecule is attached to one of the primers to obtain a handle for subsequent immobilization. The result of the amplification step is multiple copies of the identifier nucleic acid sequences, which code for the encoded molecules that have survived the selection step.
  • FIG. 16 discloses an enrichment process of building blocks. The building blocks can be used for generation of a new library. Initially, identifier nucleic acid sequences are immobilized on a solid support. In one aspect of the invention the identifier nucleic acid sequences are the product of the selection procedure described in FIG. 1. Each codon of the identifier nucleic acid sequence is identified with an uppercase letter, i.e. A, B, C, or D. The immobilized identifier nucleic acid sequences are contacted with the pool of building blocks under hybridisation conditions. Each of the building blocks is illustrated with a sequence complementary to a codon which may or may not be present on the identifier nucleic acid sequence. The complementary sequences are indicated with an apostrophe, e.g. A′, B′, etc. The transferable chemical entity of a building block is illustrated with a lowercase letter. The conditions providing for hybridisation of the complementing sequences of the pool of building blocks to the immobilised identifier nucleic acid sequence are preferably such that cognate nucleic acid sequences are hybridised to each other while sequences not recognizing any immobilized sequence remain in aqueous media. The immobilized sequences of the identifier nucleic acid sequences are thus used as bait in catching building blocks with complementing sequences. Following the incubation step, non-binding building blocks are removed by washing, whereby the part of the pool of building blocks not being able to find a complementing sequence is discarded. The building blocks attached to the immobilized nucleic acid sequences are detached using dehybridisation conditions. The diminished pool of building blocks may be used in a subsequent round for preparing a new library of complexes, in which the encoded molecule comprises a reaction product comprising additions from chemical entities attached to the enriched building blocks.
Because the order of the building blocks that participated in the formation of the encoded molecules successful in the selection procedure is not preserved by the method for enriching building blocks, a scrambling of the encoded molecules may be obtained in some of the methods described herein for obtaining a library of complexes. In some applications of the library it will be an advantage to have a scrambling of the building blocks because an increased diversity is obtained.
  • FIG. 17 discloses a method for reducing the diversity of the library of complexes resulting from the method described in FIG. 16. In some of the applications of the library the diversity induced by scrambling of the building blocks is not desired. In a first step the sequences complementary to the identifier nucleic acid sequences used in FIG. 16 are provided and immobilized on a suitable solid support. In one aspect of the invention the complementary sequence is obtained from the PCR product resulting from the method according to FIG. 15. Alternatively, the complementing sequence may be obtained by extending the identifier nucleic acid sequence using a suitable primer, optionally attached to a handle such as a biotin or dinitrophenol. In a second step the immobilized complementary sequence is incubated with the scrambled library under conditions which provide for hybridisation between the complementary sequence and members of the library having affinity towards this sequence. Members of the library not having affinity to the complementary sequences remain in the media and are discarded, while members of the library able to hybridise to the immobilized nucleic acid sequences are recovered. Occasionally, nucleic acids not perfectly matching the complementary sequence immobilized on the solid support are caught. In one aspect of the invention the hybridisation products, prior to the recovery step, are treated with an enzyme capable of recognizing mismatching nucleotides and cleaving the double stranded helix in which they are situated. An example of an enzyme with this ability is T4 endonuclease VII. After the treatment with the enzyme, complexes displaying a hybridisation toward the immobilized sequence are eluted under dehybridisation conditions.
Nucleotide sequences remaining from the cleavage by the enzyme will also be present in the new library; however, these sequences will not have any effect on a subsequent selection because no molecule is attached thereto.
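  • The logic of FIG. 16 and FIG. 17 can be summarised in a toy sketch. It treats codons as single letters and hybridisation as exact matching, which is a strong simplification; the function names and the example identifiers are hypothetical and not taken from the disclosure.

```python
from itertools import product


def enrich_building_blocks(selected_identifiers, building_blocks):
    """FIG. 16 (toy model): keep building blocks whose codon occurs on any bait identifier."""
    baits = {codon for identifier in selected_identifiers for codon in identifier}
    return {codon: entity for codon, entity in building_blocks.items() if codon in baits}


def scrambled_library(enriched, length):
    """Rebuild a library from the enriched pool; codon order is not preserved."""
    return set(product(sorted(enriched), repeat=length))


def reduce_diversity(library, selected_identifiers):
    """FIG. 17 (toy model): keep only members matching an originally selected identifier."""
    return library & set(selected_identifiers)


# Hypothetical pool of six building blocks: codon -> transferable chemical entity.
pool = {"A": "a", "B": "b", "C": "c", "D": "d", "E": "e", "F": "f"}
selected = [("A", "B", "C", "D"), ("B", "A", "D", "C")]

enriched = enrich_building_blocks(selected, pool)
library = scrambled_library(enriched, 4)
print(len(enriched), len(library), len(reduce_diversity(library, selected)))
```

  • In this toy model the enriched pool of four building blocks gives a scrambled library far larger than the two selected identifiers, and the second hybridisation step brings it back to those two.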
  • EXAMPLES Example 1 Enrichment of Nucleic Acid Fragments
  • A codon was included in the oligonucleotide sequence shown below. The codon is underlined and the boldface sequences represent the “framing” regions next to each codon. These framing regions can be used for specifying the position of each codon.
    Biotin-AATTCCGGAACATA CTAGTCAAC ATGA-3′ (SEQ ID NO:1)
  • This identifier oligonucleotide was immobilized on streptavidin beads using standard protocols, i.e. 600 pmol identifier oligonucleotide with 5′-dT biotin in 50 μl 100 mM MES pH 6.0 was mixed with 50 μl SA-magnetic beads (Roche). The mixture was washed 2-3 times with 100 mM MES pH 6.0 to remove non-bound identifier oligonucleotides. To reduce background binding, the oligos and beads were incubated at RT for 10 min on a shaker, then incubated on ice for 10 min while rotating the tube. Finally, the sample was washed with 100 mM MES 4 times in 800 μl at 60° C.
  • In the case where a PCR product is immobilized, the complementing (non-sense) strand is removed using 10 mM NaOH. This will generate single-stranded DNA with the selected codons. The same procedure described in this example can be used for a collection of different identifier nucleic acid molecules that contain one or more codons. The codons in the identifier nucleic acid molecules can be the same or different, as determined by the enrichment performed on the initial library.
  • The immobilized identifier nucleic acid molecule was mixed with the pool of nucleic acid fragments shown below. This pool of fragments illustrates an original pool that was used for generating an initial library of complexes. Each fragment may possess in the 3′-end a specific chemical entity that is encoded by the codon sequence. These nucleic acid fragments contain a specific sequence in the codon region (underlined) while the framing region shown in boldface is identical among the fragments. Thus, the pool of fragments represents different codons in the same position of the identifier nucleic acid.
    1. CGT GTG ATC GAA CTC GTG TG GTAT GATCAGTTG TACT-5′
    (SEQ ID NO:2)
    2. CGT GTG ATC GAA CTC GTG TG GTAT CTAGTCGGT TACT-5′
    (SEQ ID NO:3)
    3. CGT GTG ATC GAA CTC GTG TG GTAT TCGAGTGTT TACT-5′
    (SEQ ID NO:4)
    4. CGT GTG ATC GAA CTC GTG TG GTAT AGCTCATGG TACT-5′
    (SEQ ID NO:5)
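  • The pairing between the codon of SEQ ID NO:1 and the codon regions of the four fragments listed above can be checked with a short sketch. The fragments are printed 3′ to 5′, so an antiparallel duplex with the identifier (printed 5′ to 3′) corresponds to a position-by-position complement. The codon regions are copied from the sequences above; the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base Watson-Crick complement (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)


# Codon of the immobilized identifier (SEQ ID NO:1), written 5'->3'.
identifier_codon = "CTAGTCAAC"

# Codon regions of fragments 1-4 (SEQ ID NO:2-5), written 3'->5' as printed.
fragment_codons = {
    1: "GATCAGTTG",
    2: "CTAGTCGGT",
    3: "TCGAGTGTT",
    4: "AGCTCATGG",
}


def pairs_with(codon, fragment_codon):
    """True if the fragment codon region is fully complementary to the identifier codon."""
    return complement(codon) == fragment_codon


captured = [n for n, seq in fragment_codons.items() if pairs_with(identifier_codon, seq)]
print(captured)  # fragment numbers expected to be retained on the beads
```

  • Only fragment 1 is fully complementary in the codon region, while the framing regions (GTAT and TACT) pair with the identifier in all four fragments, so discrimination rests on the codon alone.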
  • The nucleic acid fragments were mixed with the immobilized identifier nucleic acid molecules using 600 pmol of each nucleic acid fragment (100 mM MES pH 6.0, 150 mM NaCl). The mixture was incubated at 25° C. for 30 minutes in a shaker. The non-hybridized fragments were removed by washing 4 times in 800 μl 100 mM MES, 150 mM NaCl. This step should separate the complementing fragments (bound), encoding the selected chemical entities, from the non-complementing fragments (non-bound), encoding chemical entities that were not effective in the preceding selection process. The annealed fragments were eluted from the immobilized identifier nucleic acid molecules by re-suspending the beads in 25 μl 60° C. H2O and incubating for 2 min at 60° C. The enriched fragments were purified on a micro-spin gel filtration column (Bio-Rad). The eluted fragments were prepared for mass spectroscopy (MS) analysis by mixing in half a volume of ion exchanger resin and incubating for a minimum of 2 h at 25° C. on a shaker. After incubation the resin was removed by centrifugation and 15 μl of the supernatant was mixed with 7 μl of water, 2 μl of piperidine and imidazole (each 625 mM) and 24 μl acetonitrile. The sample was analysed using a Mass Spectroscopy instrument (Bruker Daltonics, Esquire 3000plus). The result of the MS analysis is shown below.
    Figure US20070026397A1-20070201-P00001
  • The mass of the correct complementary fragment (number 1) is obtained in the MS analysis (11438.39 Da, expected 11439 Da). No masses for the other fragments (numbers 2-4) could be found in the MS spectra (expected masses: 11415, 11430 and 11424 Da). This result shows that the right fragment is strongly enriched and other fragments with the wrong codon sequences are removed. The enrichment is possible even when the “spacing” region (boldface) is identical in each fragment. Two control experiments were also performed to validate the enrichment protocol. In the first experiment, the fragment with the correct codon sequence (number 1) was mixed with the immobilized identifier molecule as described above. The sample was washed and eluted as described above and prepared for MS analysis. The result from the MS analysis is shown below.
    Figure US20070026397A1-20070201-P00002
  • The result indicates that the fragment with the correct sequence (number 1) anneals to the immobilized identifier molecules and is eluted under the conditions used in this example. The expected mass (11439 Da) correlates well with the experimental mass, 11438.39 Da.
  • In the other control experiment, a fragment with a wrong codon sequence (number 3) was allowed to bind to the immobilized identifier molecule as described above. Again, the eluted sample was prepared and analysed with MS. The result is shown below.
    Figure US20070026397A1-20070201-P00003
  • In this experiment, no mass was found that corresponded to the expected mass (11430 Da) of the tested building block (number 3). Again, this shows that fragments with an anticodon sequence different from the enriched codons in the identifier nucleic acid molecules are not captured using this approach.
  • The enriched fragments obtained using this strategy may then be used to generate a new library of encoded molecules. This new library will contain encoded molecules composed of the enriched chemical entities. Thus, the library size has been reduced due to the removal of chemical entities not involved in binding encoded molecules, and the library is enriched in chemical entities that are highly represented in the encoded molecules which bind to the target molecule.
  • Example 1 shows the possibility of enriching for specific building block molecules, i.e. nucleic acid fragments associated with transferable chemical entities. The same procedure can be used for a pool of building blocks larger than the four used herein. The codon design will determine the maximum number of building blocks that can be used. The sequence in the codon region should be long enough to allow discrimination in the annealing step. Various conditions can be used to increase the stringency in the annealing step, including parameters such as temperature, salt, pH, formamide concentration and time.
  • Example 2 (Model): Multiple Codon Selection in a Library
  • This example describes the enrichment of building blocks using an identifier nucleic acid (identifier) molecule with multiple codons. These codons encode a displayed molecule (DM) that is attached to the identifier molecule before the selection is performed. The library size is determined both by the number of different chemical entities and the total number of chemical entities. The identifier molecule shown below contains three codons. The codons, which code for the displayed molecule, are indicated with underlines and the regions separating the codons (framing regions) in boldface. The size of the codons can be varied depending on the diversity needed in the library and the optimal setup for chemical entity enrichment. The framing region can also be varied depending on the discrimination needed to distinguish the precise position of a codon in the identifier molecule. The framing region will also be important for the generation of the library. This can be understood when the encoding is accomplished by extension of the encoding region as disclosed in DK PA 2002 01955 and U.S. No. 60/434,425, incorporated herein by reference. There needs to be a perfect match in the 3′-end in order to get efficient extension with a polymerase or a ligase. The spacing/framing region should be long enough to form a complementing region that allows extension with a polymerase or ligase. Preferably, the spacing region should be between 3 and 6 nucleotides. The codon region together with the spacing region will also be useful when codons are to be identified using a micro array setup. The identifier molecule with the right codon sequences will hybridize to the array and be detected.
  • The sequence below represents an enriched identifier molecule attached to the displayed molecule (DM). This identifier molecule has been enriched due to the fact that the DM binds to the target molecule in the selection process. In practice, more than one enriched identifier molecule will be obtained when using a library of displayed molecules, each attached to its identifier sequence.
    DM-GCACACTAGCTTGAGCACACTGACACAT GGAGATCAC ATG CTTCGAC
    AA TGC AGGACTCCC GCAGCTTTACGATCCCGCAGGTAACCGT
  • This identifier molecule is amplified with two primers (below) using a standard PCR reaction. For example, 500 nM of each primer, 2.5 units Taq polymerase, 0.2 mM of each dNTP, in a PCR buffer (50 mM KCl, 10 mM Tris-Cl, 3 mM DTT, 1.5 mM MgCl2, 0.1 mg/ml BSA). Run 25 cycles (94° C. melt for 30 seconds, 55° C. anneal for 45 seconds, 72° C. extension for 60 seconds).
    B-GCACACTAGCTTGAGCACACTGACA-3′
      CGAAATGCTAGGGCGTCCATTGGCA-5′
  • This will amplify the identifier molecule from the selection process and add a biotin in the 5′-end of one of the strands (below). This amplified product is then immobilized on a solid support, streptavidin beads for example. This can be performed as described in example 1.
  • When the identifier molecules have been immobilized and the excess has been removed by a washing step (as described in example 1), the complementing non-sense strand is removed by incubating in 10 mM NaOH for about 2 min and washing with 100 mM MES buffer, pH 6.0. This procedure will generate the strand shown below where the codon regions are exposed to allow hybridization with the complementing sequences.
    B-GCACACTAGCTTGAGCACACTGACACAT GGAGATCAC ATG CTTCGACA
    A TGC AGGACTCCC GCAGCTTTACGATCCCGCAGGTAACCGT
  • The next step is to protect the complementing sequences outside the codons to prevent the binding of the building block to these sequences. This can be performed by adding “blocking” oligonucleotides that have a complementing sequence. This is shown below.
    Figure US20070026397A1-20070201-C00012
    Figure US20070026397A1-20070201-C00013
  • Next, the pool of different building blocks is added and is allowed to anneal to the codon region of the identifier molecule. The position of annealing is determined by the spacing region shown in boldface. The stringency is adjusted to only allow hybridization of the correct building block in the right position. This can be accomplished by mixing the components together under various conditions. The conditions can for example include the presence of salt, formamide and various buffers adjusted to suitable pH and temperature. Below is the correct building block that will anneal to the enriched identifier molecules. These building blocks are annealed and eluted as described in example 1.
    CE-CGTGTGATCGAACTCGTGTGACTGTGTACCTCTAGTGTAC
  • The next pool of building blocks is blocked with an oligonucleotide that also protects the first codon. This is necessary to prevent binding of the building blocks in that codon.
    Figure US20070026397A1-20070201-C00014
    Figure US20070026397A1-20070201-C00015
  • Again, the library of building blocks is added to enrich for the selected codons. Below is the building block with the correct sequence. These building blocks are annealed and eluted as described in example 1.
    CE--CGTGTGATCGAACTCGTGTGACTGTGTAIIIIIIIIITACGAAGCT
    GTTACG
  • Finally, the identifier molecule is protected with a blocking oligo that exposes only the last codon.
    Figure US20070026397A1-20070201-C00016
    Figure US20070026397A1-20070201-C00017
  • A new pool of building blocks is added and allowed to hybridize to the identifier molecule. These building blocks are annealed and eluted as described in example 1.
    CE--CGTGTGATCGAACTCGTGTGACTGTGTAIIIIIIIIITACIIIIII
    IIIACGTCCTGAGGGCGT
  • The enrichment of each library of building blocks is performed in a separate tube in order to keep the libraries of building blocks separated. The enrichment is performed with building blocks loaded with chemical entities (CE).
  • Example 3 Template Versus Chemical Entity Evolution
  • The graph below illustrates the relationship between the number of chemical entities and the library size. The example below is calculated on the basis that the final encoded molecule contains four chemical entities, each individually encoded by the corresponding building block (n⁴, where n is the number of building blocks). The graph shows that the diversity decreases dramatically with the reduction of the total number of building blocks. If the number of different building blocks can be reduced to about 20-30 (library sizes of 16×10⁴ and 81×10⁴, respectively) in the selection process, then the library size for the final round of selection is low enough for identification of the binding molecules.
    Figure US20070026397A1-20070201-P00004
  • When the same analysis is performed on a protein, another situation is obtained. The example shown below is for a very small protein (50 amino acids in length). The diversity is enormous when all amino acids are included in the library. The size of the library also decreases with the total number of amino acids, but not to the same extent as shown above for a small molecule. Even when the number of different amino acids is reduced to 2, the library size is huge (2⁵⁰ ≈ 1.1×10¹⁵). This shows that amino acid enrichment is not feasible for proteins. This is even more pronounced for a mid-size protein, which contains about 300 amino acids.
    Figure US20070026397A1-20070201-P00005
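  • The library sizes discussed in this example follow from raising the number of available building blocks to the power of the number of positions. A minimal sketch, in which the function name is hypothetical, reproduces the small-molecule case (four positions) and the small-protein case (50 positions):

```python
def library_size(n_building_blocks, positions):
    """Theoretical number of distinct members: n choices at each of `positions` sites."""
    return n_building_blocks ** positions


# Small molecule built from four chemical entities.
for n in (100, 30, 20):
    print(f"{n} building blocks, 4 positions: {library_size(n, 4):,}")

# Small protein of 50 residues.
for n in (20, 2):
    print(f"{n} amino acids, 50 positions: {library_size(n, 50):.2e}")
```

  • Reducing the building-block set from 100 to 20 shrinks the four-position library by a factor of 625, whereas a 50-residue protein restricted to only two amino acids still has more than 10¹⁵ possible sequences.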
  • Example 4 Codon Analysis
  • This example illustrates one possibility for performing codon analysis on a whole population of different identifier oligonucleotides. The analysis can also be performed using an array where the probe oligonucleotides (complementary to the codons) are immobilized in discrete areas and the signal is monitored as a function of the amount of identifier oligonucleotides hybridised in each specific area. Codon analysis can also be performed using standard sequencing with a polymerase extension step.
  • In FIG. 5, four codons are shown (P1 through P4; bold pattern) along with flanking regions (light pattern). A universal Taqman probe anneals to a region adjacent to the codon region, but within the amplicon defined by the universal PCR primers Pr. 1 and Pr. 2. These primers could be the same as used for amplification of the identifier oligonucleotides encoding binders after an enrichment process on a specific target. However, if minimal-length identifiers are preferred during the encoding process, the region involved in Taqman probe annealing could be appended to the library identifier oligonucleotides by e.g. overlap PCR, ligation, or by employing a long downstream PCR primer containing the necessary sequences. The added length corresponding to the region necessary for annealing of the Taqman probe would be from 20 to 40 nts depending on the type of Taqman probe and the annealing temperature (TA) of the PCR primers. The Q-PCR reactions are preferably performed in a 96 or 384-well format on a real-time PCR thermocycling machine.
  • FIG. 5, panel A, shows the detection of abundance of a specific codon sequence in position one. Similar primers are prepared for all codon sequences. For each codon sequence utilized to encode a specific BB in the library a Q-PCR reaction is performed with a primer oligonucleotide complementary to the codon sequence in question. A downstream universal reverse primer Pr. 2 is provided after the Taqman probe to provide for an exponential amplification of the PCR amplicon. The setup is most suited for cases where the codon constitutes a length corresponding to a length suitable for a PCR primer.
  • FIG. 5, panel B shows the detection of abundance of a specific codon sequence in a specific codon position using a primer which complements a codon and a framing sequence. Similar primers are used for all the codons and framing sequences. For each codon sequence utilized to encode a specific BB at a specific codon position in the library, a Q-PCR reaction is performed with an oligo complementary to the codon sequence in question as well as a short region up- or downstream of the codon region, which ensures extension of the primer in a PCR reaction only when annealed to the codon sequence in that specific codon position. The number of specific primers and Q-PCR reactions needed to cover all codon sequences in all possible codon positions equals the number of codon sequences times the number of codon positions. Thus, monitoring the abundance of 96 different codon sequences in 4 different positions can be performed in a single run on four 96-well microtitre plates (as shown in FIG. 5, panel B) or a single 384-well plate on a suitable instrument. This architecture allows for the decoding of a library of 8.5×10⁷ different encoded molecules.
  • Quantification is performed relative to the amount of full-length PCR product obtained in a parallel control reaction on the same input material performed with the two external PCR primers Pr.1+Pr. 2. Theoretically, a similar rate of accumulation of this control amplicon compared to the accumulation of a product utilizing a single codon+sequence specific primer would indicate a 100% dominance of this particular sequence in the position in question.
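  • The relative quantification described above can be sketched numerically. The sketch assumes that abundance is read out as a threshold cycle (Ct) and that both reactions amplify with the same efficiency; neither the Ct readout nor the example values are specified in the text, and the function name is hypothetical.

```python
def codon_fraction(ct_codon, ct_control, efficiency=1.0):
    """Estimate the fraction of identifiers carrying a codon at a given position.

    ct_codon   -- threshold cycle of the codon-specific reaction (hypothetical readout)
    ct_control -- threshold cycle of the full-length control (Pr. 1 + Pr. 2)
    efficiency -- assumed per-cycle amplification efficiency (1.0 = perfect doubling)
    """
    return (1.0 + efficiency) ** (ct_control - ct_codon)


# Identical accumulation in both reactions corresponds to 100% dominance.
print(codon_fraction(ct_codon=20.0, ct_control=20.0))

# A codon-specific product appearing two cycles later corresponds to one quarter.
print(codon_fraction(ct_codon=22.0, ct_control=20.0))
```

  • With 96 codon sequences in 4 positions, 96 × 4 = 384 such codon-specific reactions would be compared against the control reaction.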
  • Although the setups shown in FIG. 5, panels A and B, employ a Taqman probe strategy, other detection systems (SYBR green, Molecular Beacons etc.) could be utilized. In theory, multiplex reactions employing up to 4 different fluorophores in the same reaction could increase throughput correspondingly.
  • An example of how a deconvolution process of a library of encoded molecules occurs is described in the following. Imagine that at the end of a selection scheme a pool of 3 ligand families (and the corresponding coding identifiers) dominates the population, with each family present at approximately the same concentration. Three different chemical entities are present in the first position of the encoded compounds, and each of these chemical entities is present in combination with one unique chemical entity out of 3 different chemical entities in position P2. Only one chemical entity in position 3 gives rise to active binders, whereas any of a 20% subset of chemical entities (e.g. determined by charge, size or other characteristics) is present in position 4. The outcome of the initial codon profile analysis would be: 3 codon sequences are equally dominating in position P1, 3 other codon sequences in position P2, 1 unique codon sequence is dominant in P3, whereas somewhat similarly increased levels of 20% of the codon sequences (background levels of the remaining 80% of sequences) are seen in P4. In such cases it could be relevant to use an iterative Q-PCR (“IQ-PCR”) strategy to perform a further deconvolution of a library after selection. Again with reference to the example above, by taking the PCR products from the 3 individual wells that contained primers giving the high yields in position P1, diluting the product appropriately and performing a second round of Q-PCR on each of these identifier oligonucleotides separately, it would be possible to deduce which codon sequence(s) is preferred in P2 when a given codon sequence is present in P1.
    Identifiers used for Q-PCR quantification
                                                  P1                  P2
    5′-CAGCTTGGACACCACGTCATACTAGCTGCTAGAGATGTGGTGATATTAGTGTGTGACGATGGTACGCACA
                                   GGAAGAAGACAGAAGACCTG
                                   TCAGGAGTCGAGAACTGAAG
                                   TGTGTACGTCAACACGTCAG
                                   TGTGGAACTACCATCCAAGG
                                   CCATCCAACATCGTTGGAAG
                                   AACCTGTCCTGTGAGATCTG
                                   TCACGAAGCTGGATGATGAG
                                   TAGCATCGATCGAACGTAGG
                                   TCGAAGCTACTGTCGAGATC
                  P3                        P4
    AGTACGAACGTGCATCAGAGAGGACGAGCAGGACCTGGAACCTGGTGC*TTCCTCCACCACGTCTCTGAC-3′
                                  CTCGACCACTGCAGGTGGAGCTCC
                                  CGTGCTTCCTCTGCTGCACCACCG
                                  CCTGGTGTCGAGGTGAGCAGCAGC
                                  CTCGACGAGGTCCATCCTGGTCGC
                                  CGTGAGGAGCAGGTCCTCCTGTCG
                                  CCTGACACTGGTCGTGGTCGAGGC
                                  CCATCTCGACGACCTGCTCCTGGG
                                  CCACGAGGTCTCCACTGGTCCAGG
                                  CCACTGAGCTGCTCCTCCAGGTGC
    Oligos for Identifier synthesis:
    FPv2: CAGCTTGGACACCACGTCATAC
    RPv2: GTCAGAGACGTGGTGGAGGAA
    Temp1-1: CAGCTTGGACACCACGTCATACTAGCTGCTAGAGATGTGGTGATATTAGTGTGTGACGAT
    Temp1-2: CAGCTTGGACACCACGTCATACGGAAGAAGACAGAAGACCTGATATTAGTGTGTGACGAT
    Temp1-3: CAGCTTGGACACCACGTCATACTCAGGAGTCGAGAACTGAAGATATTAGTGTGTGACGAT
    Temp1-4: CAGCTTGGACACCACGTCATACTGTGTACGTCAACACGTCAGATATTAGTGTGTGACGAT
    Temp1-5: CAGCTTGGACACCACGTCATACTGTGGAACTACCATCCAAGGATATTAGTGTGTGACGAT
    Temp1-6: CAGCTTGGACACCACGTCATACCCATCCAACATCGTTGGAAGATATTAGTGTGTGACGAT
    Temp1-7: CAGCTTGGACACCACGTCATACAACCTGTCCTGTGAGATCTGATATTAGTGTGTGACGAT
    Temp1-8: CAGCTTGGACACCACGTCATACTCACGAAGCTGGATGATGAGATATTAGTGTGTGACGAT
    Temp1-9: CAGCTTGGACACCACGTCATACTAGCATCGATCGAACGTAGGATATTAGTGTGTGACGAT
    Temp1-10: CAGCTTGGACACCACGTCATACTCGAAGCTACTGTCGAGATGATATTAGTGTGTGACGAT
    Temp2: GTCCTCTCTGATGCACGTTCGTACTTGTGCGTACCATCGTCACACACTAATATC
    Temp3-1: GAACGTGCATCAGAGAGGACGAGCAGGACCTGGAACCTGGTGCAATTCCAGCTTCTAGGAAGACT
    Temp3-2: GAACGTGCATCAGAGAGGACTCGACCACTGCAGGTGGAGCTCCAATTCCAGCTTCTAGGAAGACT
    Temp3-3: GAACGTGCATCAGAGAGGACGTGCTTCCTCTGCTGCACCACCGAATTCCAGCTTCTAGGAAGACT
    Temp3-4: GAACGTGCATCAGAGAGGACCTGGTGTCGAGGTGAGCAGCAGCAATTCCAGCTTCTAGGAAGACT
    Temp3-5: GAACGTGCATCAGAGAGGACTCGACGAGGTCCATCCTGGTCGCAATTCCAGCTTCTAGGAAGACT
    Temp3-6: GAACGTGCATCAGAGAGGACGTGAGGAGCAGGTCCTCCTGTCGAATTCCAGCTTCTAGGAAGACT
    Temp3-7: GAACGTGCATCAGAGAGGACCTGACACTGGTCGTGGTCGAGGCAATTCCAGCTTCTAGGAAGACT
    Temp3-8: GAACGTGCATCAGAGAGGACCATCTCGACGACCTGCTCCTGGGAATTCCAGCTTCTAGGAAGACT
    Temp3-9: GAACGTGCATCAGAGAGGACCACGAGGTCTCCACTGGTCCAGGAATTCCAGCTTCTAGGAAGACT
    Temp3-10: GAACGTGCATCAGAGAGGACCACTGAGCTGCTCCTCCAGGTGGAATTCCAGCTTCTAGGAAGACT
    Temp4: GTCAGAGACGTGGTGGAGGAAGTCTTCCTAGAAGCTGGAATT

    Taqman MGB probe binding region: * = AATTCCAGCTTCTAGGAAGAC
  • Synthesis of Identifier Oligonucleotides:
  • The 10 identifier oligonucleotides were assembled in 10 separate 50 μl PCR reactions, each containing 0.05 pmol of the oligos Q-Temp1-X, Q-Temp2, Q-Temp3-X and Q-Temp4 (X=1 through 10) and 25 pmol of the external primers FPv2 and RPv2, with TA=53° C. The 160 bp products were gel-purified using the QIAquick Gel Extraction Kit from QIAGEN (Cat. No. 28706) and quantified on a spectrophotometer. As a control, 20 ng of each of the identifiers (as estimated from these measurements) was loaded on an agarose gel.
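  • The overlap design of the four template oligos can be checked computationally. The following is a minimal sketch, assuming that Temp2 and Temp4 are bottom-strand oligos whose reverse complements overlap Temp1-1 and Temp3-1; the helper names `revcomp` and `merge` are illustrative and not part of the protocol.

```python
# Sketch: rebuild the top strand of identifier #1 from the four overlapping
# template oligos. Assumption: Temp2 and Temp4 are bottom-strand oligos, so
# their reverse complements are merged with Temp1-1 and Temp3-1.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge(left, right):
    """Join two strands on their longest suffix/prefix overlap."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap found")


FPV2 = "CAGCTTGGACACCACGTCATAC"
RPV2 = "GTCAGAGACGTGGTGGAGGAA"
TEMP1_1 = "CAGCTTGGACACCACGTCATACTAGCTGCTAGAGATGTGGTGATATTAGTGTGTGACGAT"
TEMP2 = "GTCCTCTCTGATGCACGTTCGTACTTGTGCGTACCATCGTCACACACTAATATC"
TEMP3_1 = "GAACGTGCATCAGAGAGGACGAGCAGGACCTGGAACCTGGTGCAATTCCAGCTTCTAGGAAGACT"
TEMP4 = "GTCAGAGACGTGGTGGAGGAAGTCTTCCTAGAAGCTGGAATT"

# Merge the pieces in 5' to 3' order along the top strand.
identifier = TEMP1_1
for piece in (revcomp(TEMP2), TEMP3_1, revcomp(TEMP4)):
    identifier = merge(identifier, piece)

print(len(identifier))
```

  • With the sequences as listed, the merged strand is 160 nt long, begins with FPv2 and ends with the reverse complement of RPv2, which is consistent with the 160 bp product described above.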
  • Preparation of Samples for Q-PCR:
  • Sample A: Generated by mixing 20 ng from each identifier oligonucleotide prep. Volume was adjusted to 50 μl. Concentration: 4 ng/μl=38.46 fmol/μl (160 bp×650 Da/bp=1.04×10^5 g/mol; 1 ng=9.615 fmol). Diluted to 10^7 copies/5 μl (0.00332 fmol/μl).
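  • The unit conversions used for Sample A can be reproduced as follows. This is a sketch only: the function names are illustrative, and the Avogadro constant is rounded to 6.022×10^23 per mol.

```python
AVOGADRO = 6.022e23   # molecules per mole (rounded)
DA_PER_BP = 650.0     # average mass per base pair used in the text (g/mol)


def fmol_per_ng(length_bp):
    """Femtomoles contained in 1 ng of double-stranded DNA of the given length."""
    molar_mass = length_bp * DA_PER_BP      # g/mol
    return 1e-9 / molar_mass * 1e15         # g -> mol -> fmol


def fmol_per_ul_for_copies(copies, volume_ul):
    """Concentration (fmol/ul) that gives `copies` molecules in `volume_ul`."""
    return copies / volume_ul / AVOGADRO * 1e15


stock_fmol_per_ul = 4.0 * fmol_per_ng(160)              # Sample A at 4 ng/ul
target_fmol_per_ul = fmol_per_ul_for_copies(1e7, 5.0)   # 10^7 copies per 5 ul
dilution_factor = stock_fmol_per_ul / target_fmol_per_ul

print(round(stock_fmol_per_ul, 2), round(target_fmol_per_ul, 5))
print(round(dilution_factor))
```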
  • Sample B: 20 ng/20 μl stocks of each identifier were prepared. The sample was mixed as follows:
  • 5 μl undil. Identifier #10
  • 5 μl 2× dil. Identifier #9
  • 5 μl 4× dil. Identifier #8
  • 5 μl 8× dil. Identifier #7
  • 5 μl 16× dil. Identifier #6
  • 5 μl 32× dil. Identifier #5
  • 5 μl 64× dil. Identifier #4
  • 5 μl 128× dil. Identifier #3
  • 5 μl 256× dil. Identifier #2
  • 5 μl 512× dil. Identifier #1
  • Concentration: 10 ng/50 μl=0.20 ng/μl=1.923 fmol/μl. Diluted 579.2-fold to 10^7 copies/5 μl (0.00332 fmol/μl).
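  • The 2-fold dilution scheme of Sample B translates into expected copy numbers per identifier. The sketch below assumes, as the "Expected" column of Table II does, a nominal total of 10^7 copies per 5 μl with half of that assigned to the undiluted Identifier #10; the function name is illustrative.

```python
TOTAL_COPIES = 1e7  # nominal copies per 5 ul of diluted Sample B


def expected_copies(identifier, n_identifiers=10, total=TOTAL_COPIES):
    """Expected copies of one identifier in the 2-fold dilution series.

    Identifier #n is undiluted; each lower-numbered identifier is diluted
    a further 2-fold before mixing.
    """
    return total / 2 ** (n_identifiers - identifier + 1)


for k in range(1, 11):
    print(k, expected_copies(k))
```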
  • Standard curve: The samples for the standard curve were prepared by diluting Sample A 116.55-fold to 10^9 copies/5 μl (0.33 fmol/μl) and subsequently performing a 10-fold serial dilution of this sample. 5 μl was used for each PCR reaction. The standard curve is shown in FIG. 2.
  • Q-PCR Reactions
  • For 5 ml premix (for one 96-well plate):
  • 2.5 ml Taqman Universal PCR Master Mix (Applied Biosystems; includes Taq polymerase, dNTPs and optimized Taq pol. buffer)
  • 450 μl RPv2 (10 pmol/μl)
  • 25 μl Taqman probe (6-FAM-TCCAGCTTCTAGGAAGAC-MGBNFQ; 50 μM; Applied Biosystems)
  • 1075 μl H2O
  • 40.5 μl premix was aliquoted into each well and 4.5 μl of relevant upstream PCR primer (FPv2 (for standard curve) or one of the codon specific primers listed below; 10 pmol/μl) and 5 μl sample (H2O in wells for negative controls) was added. The codon-specific PCR primers were: (Tm calculations shown are from Vector NTI; matched to Tm for RPv2 (67.7° C.))
    P1-1: GTCATACTAGCTGCTAGAGATGTGGTGATA 66.8° C.
    P1-2: CATACGGAAGAAGACAGAAGACCTGATA 67.8° C.
    P1-3: TCATACTCAGGAGTCGAGAACTGAAGATA 67.6° C.
    P1-4: CATACTGTGTACGTCAACACGTCAGATA 67.4° C.
    P1-5: CATACTGTGGAACTACCATCCAAGGATA 68.0° C.
    P1-6: CCATCCAACATCGTTGGAAGAT 67.8° C.
    P1-7: CATACAACCTGTCCTGTGAGATCTGATA 67.7° C.
    P1-8: ATACTCACGAAGCTGGATGATGAGATA 67.3° C.
    P1-9: CATACTAGCATCGATCGAACGTAGGATA 68.1° C.
    P1-10: TCATACTCGAAGCTACTGTCGAGATGATA 68.2° C.
    P2-1: ATATTAGTGTGTGACGATGGTACGCA 67.8° C.
    P3-1: ACAAGTACGAACGTGCATCAGAGA 67.7° C.
    P4-1: CGAGCAGGACCTGGAACCT 67.7° C.
    P4-2: TCGACCACTGCAGGTGGA 68.3° C.
    P4-3: GCTTCCTCTGCTGCACCA 66.7° C.
    P4-4: GGTGTCGAGGTGAGCAGCA 69.1° C.
    P4-5: CGACGAGGTCCATCCTGGT 68.6° C.
    P4-6: GTGAGGAGCAGGTCCTCCTGT 68.0° C.
    P4-7: CTGACACTGGTCGTGGTCGA 68.8° C.
    P4-8: CATCTCGACGACCTGCTCCT 67.9° C.
    P4-9: ACGAGGTCTCCACTGGTCCA 68.3° C.
    P4-10: ACTGAGCTGCTCCTCCAGGT 66.5° C.
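  • The premix recipe and per-well volumes given above can be cross-checked with a few lines of arithmetic. The sketch assumes that 10 pmol/μl equals 10 μM and that the premix is divided evenly over the wells; variable names are illustrative.

```python
# Premix components in microlitres, as listed for one 96-well plate.
PREMIX_UL = {"master_mix": 2500.0, "RPv2": 450.0, "probe": 25.0, "water": 1075.0}
PREMIX_PER_WELL_UL = 40.5
FORWARD_PRIMER_UL = 4.5
SAMPLE_UL = 5.0


def final_conc_nM(stock_uM, volume_ul, reaction_ul):
    """Final concentration in nM after dilution into the reaction volume."""
    return stock_uM * 1000.0 * volume_ul / reaction_ul


premix_total = sum(PREMIX_UL.values())
wells = premix_total / PREMIX_PER_WELL_UL
reaction_ul = PREMIX_PER_WELL_UL + FORWARD_PRIMER_UL + SAMPLE_UL

rpv2_nM = final_conc_nM(10.0, PREMIX_UL["RPv2"] / wells, reaction_ul)
probe_nM = final_conc_nM(50.0, PREMIX_UL["probe"] / wells, reaction_ul)
forward_nM = final_conc_nM(10.0, FORWARD_PRIMER_UL, reaction_ul)

print(premix_total, wells, reaction_ul)
print(rpv2_nM, probe_nM, forward_nM)
```

  • Under these assumptions the listed components add up to 4050 μl, enough for 100 wells of 40.5 μl, so the "5 ml" figure appears to refer to the total volume of 100 complete 50 μl reactions.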
  • Thermocycling/measurement of fluorescence was performed on an Applied Biosystems ABI Prism 7900HT real-time instrument utilizing the standard cycling parameters:
  • 95° C. 10 min;
  • 40 cycles of
  • 95° C. 15 sec;
  • 60° C. 1 min
  • All samples were run in duplicate.
  • Results
  • FIG. 6 shows the standard curve calculated by the 7900HT system software. The log of the starting copy number was plotted against the measured CT value. The relationship between CT and starting copy number was linear in the range from 10 to 10^9 identifier copies.
  • This standard curve was utilized by the system software to calculate the quantity in the “unknown” samples as shown below.
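  • The way a standard curve converts a CT value into a copy number can be sketched as a linear fit of CT against log10(copies). This is not the 7900HT software's implementation; the CT values below are hypothetical and were generated from an assumed slope of -3.32 and intercept of 40 purely for illustration.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# 10-fold dilution series from 10 to 10^9 copies with hypothetical CT values.
standards = [10 ** k for k in range(1, 10)]
hypothetical_ct = [40.0 - 3.32 * k for k in range(1, 10)]

slope, intercept = fit_standard_curve(standards, hypothetical_ct)
unknown_ct = 40.0 - 3.32 * 5
print(copies_from_ct(unknown_ct, slope, intercept))
```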
    TABLE I
    Sample A (Shown graphically in FIG. 7)
    Sample A: Equimolar ratios   Observed A   Observed B   Expected
    FPv2 12539947.00 11977503.00 10000000
    P1-1 445841.90 480382.03 1000000
    P1-2 884840.70 847478.56 1000000
    P1-3 1013073.56 948770.00 1000000
    P1-4 764187.94 741304.40 1000000
    P1-5 1352874.60 1275155.50 1000000
    P1-6 1284075.60 1337928.50 1000000
    P1-7 658161.80 747371.56 1000000
    P1-8 742187.20 653874.00 1000000
    P1-9 824587.75 705785.75 1000000
    P1-10 813550.75 836037.90 1000000
    P2-1 13145159.00 14482606.00 10000000
    P3-1 13263911.00 12773780.00 10000000
    P4-1 1430704.80 1472576.80 1000000
    P4-2 2681652.00 2481824.80 1000000
    P4-3 1933106.80 2085476.40 1000000
    P4-4 1359684.40 1364621.40 1000000
    P4-5 2206709.80 2065813.60 1000000
    P4-6 1652718.10 1873777.20 1000000
    P4-7 1468208.10 1416153.00 1000000
    P4-8 1664467.50 1581067.00 1000000
    P4-9 1462520.60 1594593.80 1000000
    P4-10 2020088.20 1912277.40 1000000
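  • One way to read Table I is as a ratio of the mean of the duplicate measurements to the expected copy number. The sketch below does this for three rows copied from the table; the function name and the choice of a simple mean are assumptions, not part of the reported analysis.

```python
# (primer, observed A, observed B, expected) copied from three rows of Table I
ROWS = [
    ("FPv2", 12539947.00, 11977503.00, 10_000_000),
    ("P1-1", 445841.90, 480382.03, 1_000_000),
    ("P4-2", 2681652.00, 2481824.80, 1_000_000),
]


def observed_over_expected(obs_a, obs_b, expected):
    """Mean of the duplicate measurements divided by the expected copies."""
    return (obs_a + obs_b) / 2.0 / expected


ratios = {name: observed_over_expected(a, b, exp) for name, a, b, exp in ROWS}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```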
  • TABLE II
    Sample B (Shown graphically in FIG. 8)
    Sample B: 2-fold dil.   Observed A   Observed B   Expected
    FPv2 4.97E+06 5.05E+06 10000000
    P1-1 9955.07 10899.97 9765.625
    P1-2 12732.32 13469.12 19531.25
    P1-3 25542.8 25419.85 39062.5
    P1-4 34748.89 44070.81 78125
    P1-5 110881.41 123734.13 156250
    P1-6 163687.44 166220.5 312500
    P1-7 156993.81 172005.64 625000
    P1-8 343176.78 374809.13 1250000
    P1-9 646619.44 576151 2500000
    P1-10 1.49E+06 1.72E+06 5000000
    P2-1 5.19E+06 5.37E+06 10000000
    P3-1 5.29E+06 5.09E+06 10000000
    P4-1 (no signal) 70223.8 9765.625
    P4-2 42103.32 22733.17 19531.25
    P4-3 54480.62 39663.62 39062.5
    P4-4 51293.07 43950.9 78125
    P4-5 137946.95 115027.34 156250
    P4-6 174134.64 156442.55 312500
    P4-7 316505.78 283856.84 625000
    P4-8 737661.44 691296.75 1250000
    P4-9 1.42E+06 1.45E+06 2500000
    P4-10 3.72E+06 3.52E+06 5000000
  • The results of the experiments show that identifier oligonucleotides can be accurately quantified down to or even below 10 copies, with a dynamic range of up to nine orders of magnitude, and that the tested codons can be reliably quantified relative to one another at various positions in the identifier oligonucleotide.
  • Example 5 Codon Analysis
  • Another possibility to analyse codons in identifier oligonucleotides is to use array format with attached probe oligonucleotides.
  • Six adaptors with the different anti-codon sequences in all three positions were designed. All the adaptors contain a probe binding sequence (20 nucleotides) that allows discrete binding on the microarray. Probe design is known in the art. Adaptors harbouring one to three deletions in the spacing region were used as negative controls to ensure that only the framing region is responsible for the hybridization of the identifier. Thus, the negative controls contain another framing sequence. The identifier oligonucleotide harbours the complementing codon sequence and the position directing framing regions.
    Adaptor oligonucleotides
    3′ CTCATCGGAAGGGCTCGTAACGG TGGGTTTGGG GGC TGGGTTTGGGGCGTGGGTTTGGGCGG-5′
    3′ TTTGGTAGCTGAGTGCCCTAGGCTGGGTTTGGG CGG TGGGTTTGGG GGC TGGGTTTGGGGCG-5′
    3′ TAACTGGTTTGACGCCACGCGCGTGGGTTTGGGGCGTGGGTTTGGG CGG TGGGTTTGGG GGC-5′
    3′ TAATTGAGCTGACGGCGCACGGCTGGGTTTGGG CGTGGGTTTGGG GC TGGGTTTGGGGCG-5′
    3′ TGTTGCTACTCTGGCCCGAGGCTGGGTTTGGG C TGGGTTTGGG C TGGGTTTGGGGCG-5′
    3′ ACGGGATAACAACGCAGCCTGGCTGGGTTTGGGTGGGTTTGGGTGGGTTTGGGGCG-5′
  • Identifier Oligonucleotide
    Biotin-5′ GCC ACCCAAACCC CCG
  • GenFlex hybridisation and scanning. Prior to hybridization, the Adaptor mix (100 pM final concentration for each of the adaptor oligonucleotides) in a hybridization buffer (100 mM MES, 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, 1× Denhardt's), was heated to 95° C. for 5 min and subsequently cooled and maintained at 40° C. for 5 min before loading onto the Affymetrix GenFlex probe array cartridge. The probe array was then incubated for 2 h at 45° C. at constant rotation (60 rpm). The remaining Adaptor mix was removed from the GenFlex cartridge, and replaced with the identifier in a hybridization buffer (100 mM MES, 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, 1× Denhardt's). The identifier hybridisation mix was heated to 95° C. for 5 min and subsequently cooled and maintained at 40° C. for 5 min before loading onto the Affymetrix GenFlex probe array cartridge and hybridised for 2 h at 45° C. at constant rotation (60 rpm). The washing and staining procedure was performed in the Affymetrix Fluidics Station. The probe array was exposed to 2 washes in 6×SSPE-T at 25° C. followed by 12 washes in 0.5×SSPE-T at 40° C. The biotinylated Identifier oligonucleotide was stained with a streptavidin-phycoerythrin conjugate, final concentration 2 μg/μl (Molecular Probes, Eugene, Oreg.) in 6×SSPE-T for 10 min at 25° C. followed by 6 washes in 6×SSPE-T at 25° C.
  • The probe arrays were scanned at 560 nm using a confocal laser-scanning microscope with an argon ion laser as the excitation source (Hewlett Packard GeneArray Scanner G2500A). The readings from the quantitative scanning were analysed by the Affymetrix Gene Expression Analysis Software. The results are depicted in Scheme 1.
    Scheme 1:
    Figure US20070026397A1-20070201-C00018
  • The array analysis shows that the codons, together with the framing regions, are able to distinguish between the different probe oligonucleotides. The designed probes only detect codons with the correct framing region, which makes it possible to determine first which codon is present and second at which position it is located. Just one deletion in both framing regions significantly reduces the hybridization of the identifier. Thus, the framing sequence may be used to obtain information about the position of a specific codon and the point in the reaction history when a given reaction of a chemical entity has occurred.
  • The information obtained in this example, using either QPCR or array-based codon analysis, can be used to generate a new, more focused library. The signal from the QPCR analysis or the array analysis can be used directly to combine preferred chemical entities.
  • Example 6 Generation of a Second-Generation Library
  • The information obtained from a codon analysis performed according to the principles described in Examples 4 or 5 can be utilized to assemble a new, more focused library. Sequence information can also be used to design a second-generation library with reduced diversity. This example illustrates how sequence data can be utilized to make a more focused library containing the enriched chemical entities. An identical strategy can be based on the codon analysis methods described in Examples 4 or 5.
  • A 700-member library composed of 4×25×7 chemical entities was generated. The library generation protocol is described below with the sequence information and chemical entity structure.
  • General arrangement of each complex composed of display molecule and identifier oligonucleotide in the library generation:
    Figure US20070026397A1-20070201-C00019
  • Specific codons in each oligo (Ax, Bx, Cx) were used and can be designed by using a specific nucleotide sequence for each chemical entity. In this particular setup, two complementary oligonucleotides (e.g. oligo Ax and oligo ax) containing a particular codon are allowed to hybridize before the ligation step. The ligation of the codon oligonucleotide in each position is paired with the attachment of the encoded chemical entity.
  • Overview of the Library Generation Procedure:
  • First Round of Library Generation (Round A):
    Figure US20070026397A1-20070201-C00020
  • “Pnt” corresponds to pentenoyl, an amine protecting group. “R” can be any molecular fragment. The chemicals used in library generation comprise a primary (shown) or a secondary amine.
  • Second Round of Library Generation (Round B):
    Figure US20070026397A1-20070201-C00021
  • Third Round of Library Generation (Round C):
    Figure US20070026397A1-20070201-C00022
  • General Procedure: Library Generation, Selection and Mismatch Subsequent Selection
  • First Round of Library Generation (Round A):
  • First oligonucleotides of the A series are each modified by adding to each type of oligo a small molecule building block (BBAX) to the 5′ amine forming an amide bond. After this step the identifier is comprised of oligo Ax.
  • Second Round of Library Generation (Round B):
  • 4 nmol of a mixture of different modified A oligos is then split into a number of tubes corresponding to the number of different building blocks to be used in round B. 190 pmol Oligo a and 2 μl herring DNA are added to each tube and the DNA material in each tube is lyophilized. The lyophilized DNA is then redissolved in 50 μl water and purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • Addition of Building Block
  • The DNA material in each tube is again lyophilized and redissolved in 2 μl 100 mM Na-borate pH 8.0/100 mM sulfo N-hydroxy succinimide (sNHS). For each tube 10 μl building block BBBX (100 mM in dimethyl sulfoxide [DMSO]) is preactivated by mixing with 10 μl 1-Ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC) (90 mM in dimethylformamide [DMF]) and incubating at 30° C. for 30 min. 3 μl of this preactivated mixture is then mixed with the 2 μl in each tube and allowed to react for 45 min at 30° C. Then an additional 3 μl freshly preactivated BB is added and the reaction is allowed to proceed for 45 min at 30° C. The resulting mixture is then purified by spinning through Bio-Rad P6 DG (Desalting gel).
  • Addition of Codon Oligonucleotide
  • The DNA material is then lyophilized and redissolved in 10 μl water containing 200 pmol oligo Bx (e.g. B1) and the corresponding oligo bx (e.g. b1). This is done so that the codon in oligo Bx identifies the BBBX added to the DNA identifier. 10 units of T4 DNA ligase (Promega) and 1.2 μl T4 DNA ligase buffer are then added to each tube and the mixture is incubated at 20° C. for 1 hour. The DNA identifier linked to the small molecules now comprises an Ax oligo with a Bx oligo ligated to its 3′ end. The reactions are then pooled, an appropriate volume of water is allowed to evaporate and the remaining sample is purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • Removal of Building Block Protecting Group
  • The pooled sample (˜50 μl) is adjusted to 10 mM Na-acetate (pH 5). 0.25 volumes of 25 mM iodine in tetrahydrofuran/water (1:1) is added and the sample is incubated at 37° C. for 2 h. The reaction is then quenched by addition of 2 μl of 1 M Na2S2O3 and incubation at room temperature for 5 min. The complexes are then purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water.
  • To remove sulphonamide protecting groups, the sample is adjusted to 50 μl 100 mM sodium borate pH 8.5 and 20 μl 1 500 mM 4-methoxy thiophenol (in acetonitrile) is added and the reaction is incubated at 25° C. overnight. Then the complexes are purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water and then lyophilized.
  • Third Round of Library Generation (Round C):
  • The samples are dissolved in 175 μl 100 mM Na-borate pH 8.0 and distributed into 25 wells (7 μl/well). 2 μl 100 mM BBcx in water/DMSO and 1 μl of 250 mM DMT-MM is added to each reaction and incubated at 30° C. overnight. Water is added to 50 μl and the reactions are then spin purified using Bio-Rad P6 DG (Desalting gel) and subsequently water is allowed to evaporate so that the final volume is 10 μl.
  • Addition of Codon Oligonucleotide
  • The DNA material is then lyophilized and redissolved in 10 μl water containing 200 pmol oligo Cx (e.g. C1) and the corresponding oligo cx (e.g. c1). This is done so that the codon in oligo Cx corresponds to the BBcx added to the DNA identifier. 10 units of T4 DNA ligase (Promega) and 1.2 μl T4 DNA ligase buffer are then added to each tube and incubated at 20° C. for 1 hour. The DNA identifier linked to the small molecules now comprises an Ax oligo with a Bx oligo ligated to its 3′ end and a Cx oligo ligated to the 3′ end of the Bx oligo. The reactions are then pooled, the pooled sample volume is reduced by evaporation and the sample is purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water. The pooled sample (˜50 μl) is adjusted to 10 mM Na-acetate (pH 5). 0.25 volumes of 25 mM iodine in tetrahydrofuran/water (1:1) is added and the sample is incubated at 37° C. for 2 h. The reaction is then quenched by addition of 2 μl of 1 M Na2S2O3 and incubation at RT for 5 min. Then the DNA identifiers (carrying small molecules) are purified by spinning through Biospin P-6 columns (Biorad) equilibrated with water and then lyophilized.
  • Final Deprotection Step
  • Some building blocks contain methyl esters that are deprotected to acids by dissolving the pooled sample in 5 μl 20 mM NaOH, heating to 80° C. for 10 minutes and adding 5 μl of 20 mM HCl.
  • Final Extension Step
  • To ensure that the DNA identifiers are double stranded prior to selection, oligo d is extended along the identifier by adding to the sample 10 μl of 5× sequenase EX-buffer [100 mM Hepes, pH 7.5, 50 mM MgCl2, 750 mM NaCl] and 4000 pmol oligo d. Annealing is performed by heating to 80° C. and cooling to 20° C. To the sample is then added 500 μL dNTP, water to 50 μl and 39 units of Sequenase version 2.0 (USB). The reaction is incubated at 37° C. for 1 hour.
  • Selection
  • This library is subjected to selection, whereby binders to the selection target are enriched.
  • Maxisorp ELISA wells (NUNC A/S, Denmark) were each coated with 100 μL 2 μg/mL integrin αVβ3 in PBS buffer [2.8 mM NaH2PO4, 7.2 mM Na2HPO4, 0.15 M NaCl, pH 7.2] overnight at 4° C. Then the integrin solution was substituted for 200 μl blocking buffer [TBS, 0.05% Tween 20 (Sigma P-9416), 1% bovine serum albumin (Sigma A-7030), 1 mM MnCl2] which was left on for 3 hours at room temperature. Then the wells were washed 10 times with blocking buffer and the encoded library was added to the wells after diluting it 100 times with blocking buffer. Following 2 hours incubation at room temperature the wells were washed 10 times with blocking buffer. After the final wash the wells were cleared of wash buffer and subsequently inverted and exposed to UV light at 300-350 nm for 30 seconds using a trans-illuminator set at 70% power. Then 100 μl blocking buffer without Tween-20 was immediately added to each well, the wells were shaken for 30 seconds, and the solutions containing eluted identifiers were removed for PCR amplification.
  • Cloning
  • A TOPO-TA (Invitrogen) ligation reaction is assembled with 4 μl PCR product, 1 μl salt solution (Invitrogen) and 1 μl vector. Water is added to 6 μl. The reaction is then incubated at RT for 30 min. Heat-shock competent TOP10 E. coli cells are then thawed on ice and 5 μl of the ligation reaction is added to the thawed cells. The cells are then incubated 30 min on ice, heat-shocked in 42° C. water for 30 sec, and then put on ice again. 250 μl of growth medium is added to the cells and they are incubated 1 h at 37° C. The medium containing cells is then spread on a growth plate containing 100 μg/ml ampicillin and incubated at 37° C. for 16 hours.
  • Sequencing
  • Individual E. coli clones are then picked and transferred to PCR wells containing 50 μl water. These 50 μl are incubated at 94° C. for 5 minutes, and 20 μl is used in a 25 μl PCR reaction with 5 pmol of each TOPO primer (M13 forward and M13 reverse) and Ready-To-Go PCR beads (Amersham Biosciences). The following PCR profile is used: 94° C. 2 min, then 30×(94° C. 4 sec, 50° C. 30 sec, 72° C. 1 min) then 72° C. 10 min. Primers and nucleotides are then degraded by adding 1 μl 1:1 EXO/SAP mixture (USB corp.) to 2 μl PCR product and incubating at 37° C. for 15 min and then 80° C. for 15 min to heat-inactivate the enzymes. 5 pmol T7 primer is added and water is added to 12 μl. Then 8 μl DYEnamic ET cycle sequencing Terminator Mix (Applied biosystems) is added to each well. A thermocycling profile of 30×(95° C. 20 sec, 50° C. 15 sec, 60° C. 1 min) is then run. Then 10 μl water is added to each well and sequencing reactions are purified using seq96 spinplates (Amersham Biosciences). Reactions are then run on a MegaBace capillary electrophoresis instrument (Molecular Dynamics) using injection parameters 2 kV, 60 sec and run parameters 9 kV, 45 min, and analyzed using Contig Express software (Informax).
  • The chemical entities used in each position are shown below.
    Position 1
    Building Block Smiles
    BB-A-000098
    Figure US20070026397A1-20070201-C00023
    BB-A-000112
    Figure US20070026397A1-20070201-C00024
    BB-A-000282
    Figure US20070026397A1-20070201-C00025
    BB-A-000283
    Figure US20070026397A1-20070201-C00026
  • Position 2
    BB-A-0099182
    Figure US20070026397A1-20070201-C00027
    BB-A-0000613
    Figure US20070026397A1-20070201-C00028
    BB-A-0000084
    Figure US20070026397A1-20070201-C00029
    BB-A-0000832
    Figure US20070026397A1-20070201-C00030
    BB-A-909683
    Figure US20070026397A1-20070201-C00031
    BB-A-0011001
    Figure US20070026397A1-20070201-C00032
    BB-A-0001183
    Figure US20070026397A1-20070201-C00033
    BB-A-0007821
    Figure US20070026397A1-20070201-C00034
    BB-A-0001182
    Figure US20070026397A1-20070201-C00035
    BB-A-0001182
    Figure US20070026397A1-20070201-C00036
    BB-A-0001614
    Figure US20070026397A1-20070201-C00037
    BB-A-0001642
    Figure US20070026397A1-20070201-C00038
    BB-A-0002382
    Figure US20070026397A1-20070201-C00039
    BB-A-0063143
    Figure US20070026397A1-20070201-C00040
    BB-A-0982162
    Figure US20070026397A1-20070201-C00041
    BB-A-0002182
    Figure US20070026397A1-20070201-C00042
    BB-A-0083172
    Figure US20070026397A1-20070201-C00043
    BB-A-0003182
    Figure US20070026397A1-20070201-C00044
    BB-A-0004183
    Figure US20070026397A1-20070201-C00045
    BB-A-0004183
    Figure US20070026397A1-20070201-C00046
    BB-A-0004282
    Figure US20070026397A1-20070201-C00047
    BB-A-0004282
    Figure US20070026397A1-20070201-C00048
    BB-A-0004222
    Figure US20070026397A1-20070201-C00049
    BB-A-0004222
    Figure US20070026397A1-20070201-C00050
    BB-A-0004342
    Figure US20070026397A1-20070201-C00051
  • Position 3
    BBA0000531
    Figure US20070026397A1-20070201-C00052
    BBA0001006
    Figure US20070026397A1-20070201-C00053
    BBA0001391
    Figure US20070026397A1-20070201-C00054
    BBA0001401
    Figure US20070026397A1-20070201-C00055
    BBA0008312
    Figure US20070026397A1-20070201-C00056
    BBA0008512
    Figure US20070026397A1-20070201-C00057
    BBA0008612
    Figure US20070026397A1-20070201-C00058
  • After the selection as described above, the codons in the identifier oligonucleotides were analysed. Before the analysis, the identifier oligonucleotides were amplified using the constant flanking regions and the amplified material was used in the identifier sequence analysis.
  • A sequence codon analysis of the selected codons showed a bias for specific chemical entities. They are listed in the table below. For instance, in position 1 chemical entity 98 was seen 47 times (out of 51 sequences, 92%, compared to 25% before the selection), in position 2 chemical entity 99 was seen 14 times (out of 51 sequences, 27%, compared to 4% before selection), and in position 3 chemical entity 53 was seen 35 times (out of 51 sequences, 68%, compared to 14% before selection).
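  • The enrichment figures quoted above follow from comparing the observed frequency among the 51 sequences with the frequency expected before selection. The sketch below assumes that every chemical entity at a position is equally represented before selection (4, 25 and 7 entities at positions 1, 2 and 3); the function name is illustrative.

```python
def enrichment(count, total_sequences, entities_at_position):
    """Return (observed fraction, expected fraction, fold enrichment)."""
    observed = count / total_sequences
    expected = 1.0 / entities_at_position   # assumes equal representation
    return observed, expected, observed / expected


RESULTS = {
    "98 (position 1)": enrichment(47, 51, 4),
    "99 (position 2)": enrichment(14, 51, 25),
    "53 (position 3)": enrichment(35, 51, 7),
}

for name, (observed, expected, fold) in RESULTS.items():
    print(f"{name}: {observed:.0%} observed, {expected:.0%} expected, {fold:.1f}-fold")
```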
  • The chemical entities listed in the table below can then be used to generate a new and more focused library.
    Oligo(-s) Count pos 1 pos 2 pos 3
    BB-A-000098 47 98
    BB-A-000282 4 282
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    BBA0004242 6 424
    BBA0004182 5 418
    BBA0001101 2
    BBA0003172 2
    BBA0004212 2
    BBA0004232 2
    BBA000064 1
    BBA0001011 1
    BBA0003132 1
    BBA0003142 1
    BBA0003152 1
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    Figure US20070026397A1-20070201-P00899
    BBA0001006 4 100
    BBA0008512 2
    BBA0008312 1
  • The new focused library with the selected chemical entities can be selected against the target and the outcome from the selection can be analysed. The most abundant binders will be the combination between the chemical entities 98-99-53 and the second most abundant binder is 98-158-53 as shown below.
    Oligo(-s) Count pos 1 pos 2 pos 3
    BB-A-000098 BBA000099 BBA0000531 11 98
    Figure US20070026397A1-20070201-P00899
    53
    BB-A-000098 BBA0001582 BBA0000531 7 98 158 53
    BB-A-000098 BBA0004242 BBA0000531 4 98 424 53
    BB-A-000098 BBA0001582 BBA0001391 3 98 158 139
    BB-A-000098 BBA0004182 BBA0000531 3 98 418 53
    BB-A-000098 BBA000099 BBA0001391 2 98
    Figure US20070026397A1-20070201-P00899
    139
    BB-A-000098 BBA0001582 BBA0001006 2 98 158 100
  • This example illustrates how library diversity can be reduced by using the enriched chemical entities in a new library and performing another round of selection on the chosen chemical entities.
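  • Tallying full combinations and per-position abundances from decoded sequences can be sketched with a simple counter. The list of reads below is rebuilt from the counts in the preceding table rather than from raw sequence data, and the variable names are illustrative.

```python
from collections import Counter

# (position 1, position 2, position 3) -> count, taken from the table above
COMBINATION_COUNTS = {
    (98, 99, 53): 11,
    (98, 158, 53): 7,
    (98, 424, 53): 4,
    (98, 158, 139): 3,
    (98, 418, 53): 3,
    (98, 99, 139): 2,
    (98, 158, 100): 2,
}

# Expand the counts into one tuple per decoded sequence.
reads = [combo for combo, n in COMBINATION_COUNTS.items() for _ in range(n)]

combination_tally = Counter(reads)
per_position = [Counter(read[i] for read in reads) for i in range(3)]

print(combination_tally.most_common(2))
for index, tally in enumerate(per_position, start=1):
    print(f"position {index}: {tally.most_common()}")
```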
  • Example 7
  • The following experiment illustrates the principle of chemical entity (also termed building block herein) evolution through multiple rounds of library generation and selection. The experiment is not intended to limit the scope of the current invention.
  • Libraries were assembled by the combination of building blocks (BB) each of which was encoded by an oligonucleotide (oligo). Some of the building blocks carried an amine functional group and a carboxylic acid functional group. The building block amine was protected by N-pentenoylation and deprotected by iodine treatment prior to the reaction of the following building block. Oligonucleotide 1 (Oligo1) carried an amine functional group to allow reaction with the building block 1's carboxylic acid and oligonucleotides are optionally derivatized by phosphorylation to allow ligation. Oligonucleotide3 (oligo3) also comprised a primer region for PCR amplification. EDC/NHS, EDC/sulfoNHS or DMTMM was used as coupling reagents.
  • The following scheme describes the split and mix assembly of the libraries:
  • i.) n times [BB1+Oligo1→BB1-Oligo1] in separate wells
  • * Optionally purify product
  • ii.) mix all n wells into one tube
  • iii.) split product of ii.) into m separate wells
  • iv.) m times [BB2+BB1-Oligo1+Oligo2→BB2-BB1-Oligo1-Oligo2] in separate wells
  • * Optionally purify product
  • v.) mix all m wells into one tube
  • vi.) split product of v.) into p separate wells
  • vii.) p times [BB3+BB2-BB1-Oligo1-Oligo2+Oligo3→BB3-BB2-BB1-Oligo1-Oligo2-Oligo3] in separate wells
  • * Optionally purify product
  • viii.) mix all p wells into one tube
  • ix.) Selection was performed and binders isolated
  • x.) PCR of DNA and sequencing
  • xi.) Analyse for building block abundance and full sequence information
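  • Steps i.) through viii.) of the split-and-mix scheme can be sketched in code to show why every combination of building blocks is produced exactly once. Library members are modelled here as tuples of building block labels; the labels and function names are hypothetical and the chemistry itself is not modelled.

```python
from itertools import product


def couple_round(pool, building_blocks):
    """Split the pool into one well per building block, couple, then re-pool."""
    pooled = []
    for bb in building_blocks:               # one well per building block
        for member in pool:                  # every member is present in each well
            pooled.append(member + (bb,))    # BB coupling plus codon ligation
    return pooled


def split_and_mix(bb1, bb2, bb3):
    pool = [(bb,) for bb in bb1]             # steps i.) and ii.)
    pool = couple_round(pool, bb2)           # steps iii.) to v.)
    pool = couple_round(pool, bb3)           # steps vi.) to viii.)
    return pool


bb1, bb2, bb3 = ["A1", "A2"], ["B1", "B2", "B3"], ["C1", "C2"]
library = split_and_mix(bb1, bb2, bb3)
is_complete = set(library) == set(product(bb1, bb2, bb3))

print(len(library), is_complete)
```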
  • Building block abundance analysis may be done by QPCR or by sequencing full sequences and then analyzing for the abundance of individual building blocks.
  • The following types of building blocks were used, wherein R describes a group which is varied for different building blocks:
    Figure US20070026397A1-20070201-C00059
  • The overall process leads to molecules of the following structure, where the oligonucleotide was double stranded.
    Figure US20070026397A1-20070201-C00060
  • The oligonucleotide was made double stranded by the use of double stranded Oligos 1, 2 and 3 with an overhang to allow ligation of both strands.
  • Summary of the experimental outcome:
  • Two libraries of 61,875 members (Library 1 and 2) were generated as described in example 6 above and selected for binders of the Integrin αvβ3 receptor separately. The libraries were generated with 99 different building blocks in position 1, 25 different building blocks in position 2 and 25 different building blocks in position 3.
  • The identified sequences were then analyzed for the abundances of building blocks at each position in the sequence. The most abundant building blocks at each position from the two libraries 1 and 2 were then used again to generate a new and smaller library of 1,365 members, which was selected for binders of the Integrin αvβ3 receptor. The library was generated with 7 different building blocks in position 1, 13 different building blocks in position 2 and 15 different building blocks in position 3.
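  • The library sizes quoted above are simply the product of the number of building blocks per position, which also gives the reduction in diversity between the first- and second-generation libraries. A minimal sketch (the function name is illustrative):

```python
from math import prod


def library_size(blocks_per_position):
    """Number of members in a fully combinatorial library."""
    return prod(blocks_per_position)


first_generation = library_size([99, 25, 25])
second_generation = library_size([7, 13, 15])
reduction = first_generation / second_generation

print(first_generation, second_generation, round(reduction, 1))
```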
  • In the tables below, each of the building block numbers identify one specific building block or in two instances (library 1) a mixture of three different building blocks. The same numbers are used for each building block in all libraries, however the oligonucleotide used to identify each building block may not necessarily be the same between libraries to avoid potential problems of cross contamination.
  • The following tables describe the codon sequences and corresponding building blocks used. The codon is only indicated for one of the strands.
    Library 1, Position 1
    Codon no.   Codon sequence ID   Building Block ID
    1 TGTTC BBA000092
    2 CGAGC BBA000354
    3 GGATA BBA000085
    4 CGCTG BBA000086
    5 GTTAT BBA000098
    6 AGTGC BBA000099
    7 ACCTG BBA000089
    8 CTGGT BBA000090
    9 TAGGA BBA000087
    10 ACTCA BBA000088
    11 CTTAC BBA000153
    12 CGCAC BBA000154
    13 TCGCG BBA000059
    14 CGGAT BBA000152
    15 GAGAT BBA000101
    16 TGTAG BBA000110
    17 GTGTT BBA000112
    18 AGATG BBA000113
    19 ATCCT BBA000114
    20 TTGCT BBA000286
    21 ACGTA BBA000123
    22 ATCAC BBA000124
    23 TATCC BBA000155
    24 GGAAG BBA000156
    25 CGGTC BBA000158
    26 TGCTT BBA000159
    27 TTAGC BBA000160
    28 GCTGA BBA000161
    29 GAACG BBA000162
    30 CATGG BBA000163
    31 TGGTA BBA000165
    32 TCAAG BBA000166
    33 ATCGA BBA000167
    34 ATGCA BBA000168
    35 ACTAG BBA000169
    36 TACCT BBA000170
    37 TACGA BBA000171
    38 CTTCA BBA000172
    39 CTCTT BBA000173
    40 TCATC BBA000174
    41 ATTCC BBA000175
    42 CGACG BBA000176
    43 CCTGT BBA000177
    44 CCTTC BBA000178
    45 ACACC BBA000179
    46 TAACA BBA000180
    47 TAACA BBA000098
    48 CCAGG BBA000181
    49 ATGTC BBA000182
    50 GAGGA BBA000183
    51 GGTCA BBA000184
    52 GACTT BBA000185
    53 GGTGG BBA000186
    54 CAACT BBA000190
    55 ATGAG BBA000195
    56 TCTGC BBA000196
    57 ATAGG BBA000197
    58 CTACC BBA000198
    59 AAGTG BBA000201
    60 TCCAA BBA000202
    61 GCTCT BBA000203
    62 GGAGT BBA000204
    63 AATCG BBA000205
    64 AAGCT BBA000206
    65 CCGAA BBA000207
    66 TTTGT BBA000208
    67 CCGTG BBA000209
    68 TTTCG BBA000210
    69 TGAGG BBA000211
    70 GTTGC BBA000212
    71 AACTA BBA000112
    72 AACTA BBA000280
    73 CCTCG BBA000281
    74 AGCAA BBA000282
    75 TTCCA BBA000313
    76 AGACT BBA000314
    77 AGGTT BBA000315
    78 GCGTC BBA000316
    79 AACGT BBA000317
    80 CAAGA BBA000287
    81 AGAGA BBA000419
    82 GTACT BBA000420
    83 TAGAG BBA000421
    84 ACGAT BBA000422
    85 GACCA BBA000200
    86 TCGTT BBA000194
    87 GTCTC BBA000427
    88 CAGCA BBA000428
    89 TAGTC BBA000199
    90 GGGTG BBA000187
    91 CTCAG BBA000191
    92 AGAAC BBA000284
    93 GCGAG BBA000458
    94 GATGT BBA000459
    95 TCACT BBA000461
    96 CGTCT OBA000610
    97 AGCTC OBA000611
    98 CACTC OBA000609
    99 CAGTT OBA000615
  • Library 1, Position 2
    Codon no.  Codon sequence  Building Block ID
    1 AGTACGAACGTGCATCAGAG BBA000098
    2 TAGTCTCCTCCACTTCCATG BBA000099
    3 TACATCGTTCCAGACTACCG BBA000085
    4 TCCAGTGCAAGACTGAACAG BBA000153
    5 AGCATCACTACTCTGTCTGG BBA000206
    6 TCTTGTCAACCTTCCATGCG BBA000200
    7 AAGGACGTTCCTAGTAGGTG BBA000208
    8 GGAACCATCAAGATCCTGAG BBA000091
    9 ATCTCTGACGAGATCCAAGG BBA000090
    10 TCAAGGTTGGTGGTGTACTG BBA000092
    11 TCGAACTTGTTGCTTCCTCG BBA000123
    12 CTGAGTGTGTAGTACCAACG BBA000156
    13 ATCTTGGTTGTTCTCCTGCG BBA000163
    14 TAGTAGCTTGGAGTAGACCG BBA000197
    15 TTCACTCCATGCAGCATGTG BBA000083
    16 ACGATGGTGATCGATCAACG BBA000181
    17 TTCAGTGCTTGAGCTACCTG BBA000152
    18 TTGGACTCTTCTTGCACCAG BBA000088
    19 TCAACCAACTGGTTCTTGGG BBA000100
    20 TAGTACTCTACACTGCTGCG BBA00087 101 196
    21 TACACCATGACTTGCAGACG BBA00087 101 196
    22 GCATCTTGAGTCGTTGAACG BBA000059
    23 GACTCATCTCACTGGAGTTG BBA000124
    24 TCCAGCTTCTAGGAAGACAG BBA000160
    25 CTTCTTGAGTGCACTAGCAG BBA000201
  • Library 1, Position 3
    Codon no.  Codon sequence  Building Block ID
    1 CGAGCAGGACCTGGAACCTGGTGC BBA000098
    2 CTCGACCACTGCAGGTGGAGCTCC BBA000099
    3 CGTGCTTCCTCTGCTGCACCACCG BBA000085
    4 CCTGGTGTCGAGGTGAGCAGCAGC BBA000153
    5 CTCGACGAGGTCCATCCTGGTCGC BBA000206
    6 CGTGAGGAGCAGGTCCTCCTGTCG BBA000200
    7 CCTGACACTGGTCGTGGTCGAGGC BBA000208
    8 CCATCTCGACGACCTGCTCCTGGG BBA000091
    9 CCACGAGGTCTCCACTGGTCCAGG BBA000090
    10 CCACTGAGCTGCTCCTCCAGGTGG BBA000092
    11 CCTCCTGTCCTGCACGTCCATCCG BBA000123
    12 CAGCACCTGGAGGTAGGACCACGG BBA000156
    13 CGACCAGACGAGGACCAGGTAGGC BBA000163
    14 CCAGGTTCGAGGACCTCGTCAGCC BBA000197
    15 CGAGCACGAGGAGCACGTGTCCAG BBA000100
    16 CCACGTCCACAGGTGCACCAGGTG BBA000181
    17 CCTGGTGCTCCACGACGTGCTTCG BBA000152
    18 CACGTGACGACCTGGTCAGGTGGG BBA000088
    19 CGTAGCTCGTGCTGGTCCTCCTGG BBA000101
    20 CGACGACCACCACCTTGGACACCC BBA000196
    21 CCTACGTCGTGCTCACGTCCTGCC BBA000087
    22 CGACGACAGCTAGGAGGAGGTGGG BBA000083
    23 CTGGTGGAGCTGCACGAGCACAGC BBA000059
    24 CAGGACTGGACGACGACCAGGTCG BBA000124
    25 CGATGCTGCAGACGACCAGCACCC BBA000160
  • Library 2, Position 1
    Codon no.  Codon sequence  Building Block ID
    1 TGTTC BBA000092
    2 CGAGC BBA000354
    3 GGATA BBA000085
    4 CGCTG BBA000086
    5 GTTAT BBA000098
    6 AGTGC BBA000099
    7 ACCTG BBA000089
    8 CTGGT BBA000090
    9 TAGGA BBA000087
    10 ACTCA BBA000088
    11 CTTAC BBA000153
    12 CGCAC BBA000154
    13 TCGCG BBA000059
    14 CGGAT BBA000152
    15 GAGAT BBA000101
    16 TGTAG BBA000110
    17 GTGTT BBA000112
    18 AGATG BBA000113
    19 ATCCT BBA000114
    20 TTGCT BBA000286
    21 ACGTA BBA000123
    22 ATCAC BBA000124
    23 TATCC BBA000155
    24 GGAAG BBA000156
    25 CGGTC BBA000158
    26 TGCTT BBA000159
    27 TTAGC BBA000160
    28 GCTGA BBA000161
    29 GAACG BBA000162
    30 CATGG BBA000163
    31 TGGTA BBA000165
    32 TCAAG BBA000166
    33 ATCGA BBA000167
    34 ATGCA BBA000168
    35 ACTAG BBA000169
    36 TACCT BBA000170
    37 TACGA BBA000171
    38 CTTCA BBA000172
    39 CTCTT BBA000173
    40 TCATC BBA000174
    41 ATTCC BBA000175
    42 CGACG BBA000176
    43 CCTGT BBA000177
    44 CCTTC BBA000178
    45 ACACC BBA000179
    46 TAACA BBA000180
    47 TAACA BBA000098
    48 CCAGG BBA000181
    49 ATGTC BBA000182
    50 GAGGA BBA000183
    51 GGTCA BBA000184
    52 GACTT BBA000185
    53 GGTGG BBA000186
    54 CAACT BBA000190
    55 ATGAG BBA000195
    56 TCTGC BBA000196
    57 ATAGG BBA000197
    58 CTACC BBA000198
    59 AAGTG BBA000201
    60 TCCAA BBA000202
    61 GCTCT BBA000203
    62 GGAGT BBA000204
    63 AATCG BBA000205
    64 AAGCT BBA000206
    65 CCGAA BBA000207
    66 TTTGT BBA000208
    67 CCGTG BBA000209
    68 TTTCG BBA000210
    69 TGAGG BBA000211
    70 GTTGC BBA000212
    71 AACTA BBA000112
    72 AACTA BBA000280
    73 CCTCG BBA000281
    74 AGCAA BBA000282
    75 TTCCA BBA000313
    76 AGACT BBA000314
    77 AGGTT BBA000315
    78 GCGTC BBA000316
    79 AACGT BBA000317
    80 CAAGA BBA000287
    81 AGAGA BBA000419
    82 GTACT BBA000420
    83 TAGAG BBA000421
    84 ACGAT BBA000422
    85 GACCA BBA000200
    86 TCGTT BBA000194
    87 GTCTC BBA000427
    88 CAGCA BBA000428
    89 TAGTC BBA000199
    90 GGGTG BBA000187
    91 CTCAG BBA000191
    92 AGAAC BBA000284
    93 GCGAG BBA000458
    94 GATGT BBA000459
    95 TCACT BBA000461
    96 CGTCT OBA000610
    97 AGCTC OBA000611
    98 CACTC OBA000609
    99 CAGTT OBA000615
  • Library 2, Position 2
    Codon no.  Codon sequence  Building Block ID
    1 AGTACGAACGTGCATCAGAG BBA000059
    2 TAGTCTCCTCCACTTCCATG BBA000085
    3 TACATCGTTCCAGACTACCG BBA000098
    4 TCCAGTGCAAGACTGAACAG BBA000099
    5 AGCATCACTACTCTGTCTGG BBA000101
    6 TCTTGTCAACCTTCCATGCG BBA000110
    7 AAGGACGTTCCTAGTAGGTG BBA000113
    8 GGAACCATCAAGATCCTGAG BBA000114
    9 ATCTCTGACGAGATCCAAGG BBA000123
    10 TCAAGGTTGGTGGTGTACTG BBA000124
    11 TCGAACTTGTTGCTTCCTCG BBA000152
    12 CTGAGTGTGTAGTACCAACG BBA000158
    13 ATCTTGGTTGTTCTCCTGCG BBA000160
    14 TAGTAGCTTGGAGTAGACCG BBA000161
    15 TTCACTCCATGCAGCATGTG BBA000167
    16 ACGATGGTGATCGATCAACG BBA000176
    17 TTCAGTGCTTGAGCTACCTG BBA000181
    18 TTGGACTCTTCTTGCACCAG BBA000313
    19 TCAACCAACTGGTTCTTGGG BBA000314
    20 TAGTACTCTACACTGCTGCG BBA000315
    21 TACACCATGACTTGCAGACG BBA000316
    22 GCATCTTGAGTCGTTGAACG BBA000317
    23 GACTCATCTCACTGGAGTTG BBA000420
    24 TCCAGCTTCTAGGAAGACAG BBA000421
    25 CTTCTTGAGTGCACTAGCAG BBA000422
  • Library 2, Position 3
    Codon no.  Codon sequence  Building Block ID
    1 CGAGCAGGACCTGGAACCTGGTGC BBA000052
    2 CTCGACCACTGCAGGTGGAGCTCC BBA000053
    3 CGTGCTTCCTCTGCTGCACCACCG BBA000054
    4 CCTGGTGTCGAGGTGAGCAGCAGC BBA000056
    5 CTCGACGAGGTCCATCCTGGTCGC BBA000057
    6 CGTGAGGAGCAGGTCCTCCTGTCG BBA000058
    7 CCTGACACTGGTCGTGGTCGAGGC BBA000062
    8 CCATCTCGACGACCTGCTCCTGGG BBA000139
    9 CCACGAGGTCTCCACTGGTCCAGG BBA000140
    10 CCACTGAGCTGCTCCTCCAGGTGG BBA000100
    11 CCTCCTGTCCTGCACGTCCATCCG BBA000059
    12 CAGCACCTGGAGGTAGGACCACGG BBA000085
    13 CGACCAGACGAGGACCAGGTAGGC BBA000098
    14 CCAGGTTCGAGGACCTCGTCAGCC BBA000099
    15 CGAGCACGAGGAGCACGTGTCCAG BBA000101
    16 CCACGTCCACAGGTGCACCAGGTG BBA000110
    17 CCTGGTGCTCCACGACGTGCTTCG BBA000113
    18 CACGTGACGACCTGGTCAGGTGGG BBA000114
    19 CGTAGCTCGTGCTGGTCCTCCTGG BBA000123
    20 CGACGACCACCACCTTGGACACCC BBA000124
    21 CCTACGTCGTGCTCACGTCCTGCC BBA000152
    22 CGACGACAGCTAGGAGGAGGTGGG BBA000158
    23 CTGGTGGAGCTGCACGAGCACAGC BBA000160
    24 CAGGACTGGACGACGACCAGGTCG BBA000161
    25 CGATGCTGCAGACGACCAGCACCC BBA000167
  • Library 3, Position 1
    Codon no.  Codon sequence  Building Block ID  More abundant in position 1 in library no.
    1 TGTTC BBA000092 1
    2 ACTCA BBA000088 1
    3 CTTAC BBA000153 1 and 2
    4 CGGAT BBA000152 1
    5 ATTCC BBA000175 1 and 2
    6 GTCTC BBA000427 1
    7 ACAGT BBA000098 1 and 2
  • Library 3, Position 2
    Codon no.  Codon sequence  Building Block ID  More abundant in position 2 in library no.
    1 6CACAAGTACGAACGTGCATCAGAG BBA000059 1
    2 6CACATAGTCTCCTCCACTTCCATG BBA000083 1
    3 6CACATACATCGTTCCAGACTACCG BBA000085 2
    4 6CACATCCAGTGCAAGACTGAACAG BBA000088 1
    5 6CACAAGCATCACTACTCTGTCTGG BBA000090 1
    6 6CACATCTTGTCAACCTTCCATGCG BBA000099 1 and 2
    7 6CACAAAGGACGTTCCTAGTAGGTG BBA000110
    8 6CACAGGAACCATCAAGATCCTGAG BBA000114 2
    9 6CACAATCTCTGACGAGATCCAAGG BBA000152 2
    10 6CACATCAAGGTTGGTGGTGTACTG BBA000160 2
    11 6CACATCGAACTTGTTGCTTCCTCG BBA000200 1
    12 6CACACTGAGTGTGTAGTACCAACG BBA000201 1
    13 6CACAATCTTGGTTGTTCTCCTGCG BBA000422 2
  • Library 3, Position 3
    Codon no.  Codon sequence  Building Block ID  More abundant in position 3 in library no.
    1 6AGGACGAGCAGGACCTGGAACCTGGTGCGTTCCTCCACCACGTCTCCG BBA000053 2
    2 6AGGACTCGACCACTGCAGGTGGAGCTCCGTTCCTCCACCACGTCTCCG BBA000085 1
    3 6AGGACGTGCTTCCTCTGCTGCACCACCGGTTCCTCCACCACGTCTCCG BBA000087 1
    4 6AGGACCTGGTGTCGAGGTGAGCAGCAGCGTTCCTCCACCACGTCTCCG BBA000090 1
    5 6AGGACTCGACGAGGTCCATCCTGGTCGCGTTCCTCCACCACGTCTCCG BBA000091 1
    6 6AGGACGTGAGGAGCAGGTCCTCCTGTCGGTTCCTCCACCACGTCTCCG BBA000098 1
    7 6AGGACCTGACACTGGTCGTGGTCGAGGCGTTCCTCCACCACGTCTCCG BBA000100 1 and 2
    8 6AGGACCATCTCGACGACCTGCTCCTGGGGTTCCTCCACCACGTCTCCG BBA000139 2
    9 6AGGACCACGAGGTCTCCACTGGTCCAGGGTTCCTCCACCACGTCTCCG BBA000140 2
    10 6AGGACCACTGAGCTGCTCCTCCAGGTGGGTTCCTCCACCACGTCTCCG BBA000152
    11 6AGGACCTCCTGTCCTGCACGTCCATCCGGTTCCTCCACCACGTCTCCG BBA000153 1
    12 6AGGACAGCACCTGGAGGTAGGACCACGGGTTCCTCCACCACGTCTCCG BBA000161
    13 6AGGACGACCAGACGAGGACCAGGTAGGCGTTCCTCCACCACGTCTCCG BBA000167 2
    14 6AGGACCAGGTTCGAGGACCTCGTCAGCCGTTCCTCCACCACGTCTCCG BBA000197 1
    15 6AGGACGAGCACGAGGAGCACGTGTCCAGGTTCCTCCACCACGTCTCCG BBA000200 1
  • A subset of the isolated sequences from the library post selection was analysed:
    (1) GGCAGCACAGTCGTCGCACATACATCGTTCCAGACTACCGAGGAC
    CTGACACTGGTCGTGGTCGAGGCGTTCCT
    (2) GGCAGCACAGTCGTCGCTACATGCTTGTCAACCTTCCATGCGAGT
    ACCTTACACTGGTTCGTGGTCGAGGCGTTCCT
    (3) GGCAGCCGGAT423CGTCGCACATCTTGTCAACCTTCCATGCGAG
    GACCTGACACTGGTCGTGGTCGAGGCGTTCCT
    (4) GGCAGCCTTACGTCGCACAATTCTCTGACAGAAATCCAACGGAGG
    ACCTGACACGTGCGTCGTGGCTCGATGCGTTCCTC
    (5) GGCAGCACAGTCGTCGCACATCATTGTACAAACCTTCCATGCGAG
    GACCATCTCGACGACCTGCTCCTGGGGTNCCTC
    (6) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (7) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (8) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (9) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (10) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGAGCTGCTCCGGGGTTCCTC
    (11) GGCAGCACTAGATCGTCGCACATCTTGTCAACCTTCCATGCGAGG
    ACCATCTTCGACTGANCTGCCTCCTGTGGGCTTCCTC
    (12) GGCAGCACAGATCGTCGCACATCTTGTCAACCTTCCATGCGAGGA
    CCATCTCGACGANCTGCTCCTGGGGTTCCTC
    (13) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCACGACTACCTTGGCTCCCTGGGGTTCCTC
    (14) GGCAGCACAGTCGTCGCACATCTTGTCACCTTCCATGCGAGGACC
    ATCTCGACGACCTGCTCCTGGGGCCCTC
    (15) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (16) GGCAGCCGGATCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (17) GGCAGCCGGATCGTCGCACATCTTGTCACCTTCCATGCGAGGACC
    ATCTCGACGACCTGCTCCTGGGGTTCCTC
    (18) GGCAGCCGGATCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    CATCTCGACGACCTGCTCCTGGGGTTCCTC
    (19) GGCAGCACAGTCGTCGCAATCCAGTCAAGACTGAACAGAGGACCA
    TCTCGACGACCTGCTCCTGGGTT
    (20) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTTTCCATGCGAGG
    ACGAGCAGGACCTGGAACCTGGTGCGTTCCTC
    (21) GGCAGCACAGTCGTCGCACATCTTGTCACCTTCCATGCGAGGACG
    AGCAGGACCTGGAACCTGGTGCGTTCCTC
    (22) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    GATGCAGGACCTGGAACCTGGTGCGTTCCTC
    (23) GGCAGCCGGATCGTCGCAGATCTTGGTNAANCTTCCATGCGAGGA
    CGAGCATGAACTGGAACCTGGTGCGTTCCTC
    (24) GGCGGATCGTCGCACATCTTGTCAACCTTCCATGCGAGGACCACG
    AGGTCTCCACTGGTCCAGGGGTTCCTC
    (25) GGCAGCACAGTCGTCGGCACATCTTTGGTCAACCTTCCATGCGAG
    GACCACGAGGTCTCCACTGGTCCAGGGTTCCTC
    (26) GGCAGCCGGATCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    GACCAAGACGAGGACCAGGTAGGCGTTCCT
    (27) GGCAGCCGGAT423CGTCGCACATCTTGTCAACCTTCCATGCGAG
    GACGTGATGGAGCAAGTCCTCCTGTCGGTTCCTC
    (28) GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC
    ACGAGGTCTCCACTGGTCCAGGTTCCTC
    (29) GCCCAAACAAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGA
    CCGAGNNNGTAGCTGGANNCTCGGATGCGTTCCT
    (30) GCAGCACAGATCGTCGCACATGCTTGTCAAGCCTTTCCATCGCGA
    GGACCATCCTACGGAGCGAGCACTTGCTGCCTGGGGTTC
    (31) GGCAGCCGGATCGTCGCACATCAATGGTTTGGCTGGTGATACTGA
    GGACCACGACGTCTACACTTGGTTCCAGGGTTCCTC
  • These sequences could be translated into the following building block compositions:
    Sequence no.  Position 1  Position 2  Position 3
    1 BBA000098 BBA000085 BBA000100
    2 BBA000098 BBA000099 BBA000100
    3 BBA000152 BBA000099 BBA000100
    4 BBA000153 BBA000152 BBA000100
    5 BBA000098 BBA000099 BBA000139
    6 BBA000098 BBA000099 BBA000139
    7 BBA000098 BBA000099 BBA000139
    8 BBA000098 BBA000099 BBA000139
    9 BBA000098 BBA000099 BBA000139
    10 BBA000098 BBA000099 BBA000139
    11 BBA000098 BBA000099 BBA000139
    12 BBA000098 BBA000099 BBA000139
    13 BBA000098 BBA000099 BBA000139
    14 BBA000098 BBA000099 BBA000139
    15 BBA000098 BBA000099 BBA000139
    16 BBA000152 BBA000099 BBA000139
    17 BBA000152 BBA000099 BBA000139
    18 BBA000152 BBA000099 BBA000139
    19 BBA000098 BBA000088 BBA000139
    20 BBA000098 BBA000099 BBA000053
    21 BBA000098 BBA000099 BBA000053
    22 BBA000098 BBA000099 BBA000053
    23 BBA000152 BBA000099 BBA000053
    24 BBA000152 BBA000099 BBA000140
    25 BBA000098 BBA000099 BBA000140
    26 BBA000152 BBA000099 BBA000167
    27 BBA000152 BBA000099 BBA000098
    28 BBA000098 BBA000099 BBA000200
    29 BBA000098 BBA000099
    30 BBA000098 BBA000099
    31 BBA000152 BBA000160
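  • The translation from identifier sequence to building blocks can be sketched as a table lookup. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes that each read begins with the fixed six-nucleotide prefix GGCAGC seen in the listed sequences, that the position 1 codon occupies the next five nucleotides, and that the position 2 and position 3 codons can be located by exact substring match. Only a few entries of the Library 3 codon tables are included, and all function and variable names are hypothetical.

```python
# Illustrative sketch: decode a sequenced identifier into building block IDs.
# Assumptions (not stated in the source): fixed prefix "GGCAGC", position 1
# codon in the next 5 nt, exact matching only (no sequencing-error tolerance).

PREFIX = "GGCAGC"

# Subset of the Library 3 codon tables (codon -> building block ID).
POSITION_1 = {
    "ACAGT": "BBA000098",
    "CGGAT": "BBA000152",
    "CTTAC": "BBA000153",
}
POSITION_2 = {
    "TACATCGTTCCAGACTACCG": "BBA000085",
    "TCTTGTCAACCTTCCATGCG": "BBA000099",
}
POSITION_3 = {
    "CCTGACACTGGTCGTGGTCGAGGC": "BBA000100",
    "CCATCTCGACGACCTGCTCCTGGG": "BBA000139",
}


def find_codon(read, table):
    """Return the building block whose codon occurs in the read, else None."""
    for codon, block in table.items():
        if codon in read:
            return block
    return None


def decode(read):
    """Return (position 1, position 2, position 3) building block IDs."""
    pos1 = None
    if read.startswith(PREFIX):
        start = len(PREFIX)
        pos1 = POSITION_1.get(read[start:start + 5])
    return (pos1, find_codon(read, POSITION_2), find_codon(read, POSITION_3))


# Sequences (1) and (6) from the list above, with the line wrap removed.
SEQ_1 = ("GGCAGCACAGTCGTCGCACATACATCGTTCCAGACTACCGAGGAC"
         "CTGACACTGGTCGTGGTCGAGGCGTTCCT")
SEQ_6 = ("GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGAC"
         "CATCTCGACGACCTGCTCCTGGGGTTCCTC")

print(decode(SEQ_1))  # ('BBA000098', 'BBA000085', 'BBA000100')
print(decode(SEQ_6))  # ('BBA000098', 'BBA000099', 'BBA000139')
```

    Reads containing insertions, deletions, or ambiguous bases (N) would return None for the affected position under exact matching; a tolerant matcher would be needed to handle such reads.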
  • In position 1, L-Asp (BBA000098) dominated. D-Asp (BBA000152) was also found.
  • In position 2, Gly (BBA000099) dominated.
  • In position 3, building blocks carrying an amidine and no amine functionality were found to dominate:
    Figure US20070026397A1-20070201-C00061
  • The most abundant sequence was thereby found to correspond to the following structure:
    Figure US20070026397A1-20070201-C00062
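  • The dominance observations above can be checked by tallying the translated compositions. The Python sketch below is illustrative only: it transcribes the 28 fully translated rows of the table (rows 29 to 31, which lack at least one position, are left out as an assumption of this sketch) and counts both whole compositions and per-position building blocks. The variable names are hypothetical.

```python
from collections import Counter

# The 28 complete rows of the translation table (position 1, 2, 3).
rows = (
    [("BBA000098", "BBA000085", "BBA000100"),    # row 1
     ("BBA000098", "BBA000099", "BBA000100"),    # row 2
     ("BBA000152", "BBA000099", "BBA000100"),    # row 3
     ("BBA000153", "BBA000152", "BBA000100")]    # row 4
    + [("BBA000098", "BBA000099", "BBA000139")] * 11   # rows 5-15
    + [("BBA000152", "BBA000099", "BBA000139")] * 3    # rows 16-18
    + [("BBA000098", "BBA000088", "BBA000139")]        # row 19
    + [("BBA000098", "BBA000099", "BBA000053")] * 3    # rows 20-22
    + [("BBA000152", "BBA000099", "BBA000053"),  # row 23
       ("BBA000152", "BBA000099", "BBA000140"),  # row 24
       ("BBA000098", "BBA000099", "BBA000140"),  # row 25
       ("BBA000152", "BBA000099", "BBA000167"),  # row 26
       ("BBA000152", "BBA000099", "BBA000098"),  # row 27
       ("BBA000098", "BBA000099", "BBA000200")]  # row 28
)

# Most frequent whole composition.
composition_counts = Counter(rows)
top_composition, top_count = composition_counts.most_common(1)[0]

# Most frequent building block at each of the three positions.
position_counts = [Counter(row[i] for row in rows) for i in range(3)]
top_per_position = [c.most_common(1)[0] for c in position_counts]

print(top_composition, top_count)
print(top_per_position)
```

    Under these assumptions the most frequent composition is BBA000098-BBA000099-BBA000139 (11 of the 28 complete rows), with BBA000098 and BBA000099 the most frequent building blocks in positions 1 and 2 respectively.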
  • The following 3 sequences
  • BBA000098-BBA000099-BBA000139
  • BBA000098-BBA000099-BBA000100
  • BBA000098-BBA000099-BBA000053
  • out of the 31 identified sequences were selected for further analysis using a standard ELISA assay and thereby verified as binders of the αvβ3 Integrin receptor.
  • While the invention has been described with references to specific methods and embodiments, it will be appreciated that various modifications and changes may be made without departing from the invention. All patent and literature references cited herein are hereby incorporated by reference in their entirety.

Claims (33)

1. A method for producing a composition of molecules with an improved desired property, comprising the steps of:
i) providing an initial library comprising a plurality of different encoded molecules associated with a corresponding identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction product of multiple chemical entities and the identifier nucleic acid sequence comprises codons identifying said chemical entities,
ii) subjecting the initial library to a condition partitioning members having encoded molecules displaying a predetermined property from the remainder of the initial library,
iii) identifying codons of the identifier nucleic acid sequences of the partitioned members of the initial library, and
iv) preparing a second-generation library of encoded molecules using the chemical entities coded for by the codons of the partitioned members of the initial library or a part thereof.
2. The method according to claim 1, wherein the second-generation library comprises a plurality of different encoded molecules associated with a corresponding identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction product of multiple chemical entities and the identifier nucleic acid sequence comprises codons identifying said chemical entities.
3. The method of claim 1, further comprising subjecting the second generation library to a condition partitioning members having encoded molecules displaying a predetermined property from the remainder of the second generation library.
4-6. (canceled)
7. The method according to claim 1, wherein the encoded molecule is covalently associated with the corresponding identifier nucleic acid sequence.
8. (canceled)
9. The method according to claim 1, wherein the chemical entities are reacted without enzymatic interaction to produce the encoded molecule.
10. The method according to claim 1, wherein some or all chemical entities are not naturally occurring α-amino acids or precursors thereof.
11. The method according to claim 1, wherein the encoded molecule is not an α-polypeptide.
12. The method according to claim 1, wherein each codon comprises 4 or more nucleotides.
13. The method according to claim 1, wherein the codons are separated by a framing sequence.
14. The method according to claim 13, wherein the framing sequence positions the reaction of a chemical entity in the synthesis history of the encoded molecule.
15. (canceled)
16. The method according to claim 1, wherein the identifier nucleic acid sequence comprises three or more codons.
17. The method according to claim 1, wherein the identifier nucleic acid sequence is amplifiable and comprises codons identifying chemical entities, which have participated in the formation of the encoded molecule.
18. (canceled)
19. The method according to claim 1, wherein the encoded molecule has a molecular weight less than 2000 Dalton, preferably less than 1000 Dalton, and more preferred less than 500 Dalton.
20-21. (canceled)
22. The method according to claim 1, wherein identifier nucleic acid sequence prior to step iii) is amplified.
23. (canceled)
24. The method according to claim 1, wherein the codons of the identifier nucleic acid sequences of the partitioned members of the initial library are identified by contacting said identifier nucleic acid sequences with a pool of nucleic acid fragments under conditions allowing for hybridisation.
25-33. (canceled)
34. The method according to claim 24, wherein nucleic acid fragments are primer oligonucleotides, and the identification involves subjecting the hybridisation complex between the primer oligonucleotides and the identifier nucleic acid sequences to a condition allowing for an extension reaction to occur when the primer is sufficient complementary to a part of the identifier nucleic acid sequence, and evaluating based on measurement of the extension reaction, the presence, absence, or relative abundance of one or more codons.
35-47. (canceled)
48. The method according to claim 24, wherein the nucleic acid fragment is associated with a chemical entity precursor capable of being transferred to a recipient reactive group.
49-59. (canceled)
60. The method according to claim 1, wherein the second generation library is formed by
a) mixing under hybridisation conditions, nascent bifunctional complexes comprising a chemical entity or a reaction product of chemical entities, and an identifier nucleic acid sequence comprising codon(s) identifying said chemical entities, with the recovered nucleic acid fragments, said fragments comprising an oligonucleotide sufficient complementary to at least a part of the identifier nucleic acid sequence to allow for hybridisation, a transferable chemical entity and an anticodon identifying the chemical entity, to form hybridisation products,
b) transferring the chemical entities of the nucleic acid fragments to the nascent bifunctional complexes through a reaction involving a reactive group of the nascent bifunctional complex, in conjunction with a transfer of the genetic information of the anticodon.
61-64. (canceled)
65. The method according to, wherein the second generation library are subjected to a partitioning according to step ii) of claim 1.
66. The method according to claim 1, wherein, prior to the partitioning, the second generation library of complexes are contacted with sequences complementary to the identifier nucleic acid sequences, and the complexes which have hybridised with the complementary sequences are recovered.
67-68. (canceled)
69. The method according to claim 1, wherein the second-generation library is prepared using chemical entities appearing in the initial library and chemical entities foreign to the initial library.
70-75. (canceled)
US10/546,538 2003-02-21 2004-02-23 Method for producing second-generation library Abandoned US20070026397A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/546,538 US20070026397A1 (en) 2003-02-21 2004-02-23 Method for producing second-generation library
US13/179,283 US9096951B2 (en) 2003-02-21 2011-07-08 Method for producing second-generation library

Applications Claiming Priority (11)

Application Number Priority Date Filing Date Title
US44846003P 2003-02-21 2003-02-21
US44848003P 2003-02-21 2003-02-21
DKPA200300268 2003-02-21
DKPA200300268 2003-02-21
DKPA200300269 2003-02-21
DKPA200300269 2003-02-21
DKPA200301356 2003-09-18
DKPA200301356 2003-09-18
US50474803P 2003-09-22 2003-09-22
US10/546,538 US20070026397A1 (en) 2003-02-21 2004-02-23 Method for producing second-generation library
PCT/DK2004/000117 WO2004074429A2 (en) 2003-02-21 2004-02-23 Method for producing second-generation library

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2004/000117 A-371-Of-International WO2004074429A2 (en) 2003-02-21 2004-02-23 Method for producing second-generation library

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/179,283 Continuation US9096951B2 (en) 2003-02-21 2011-07-08 Method for producing second-generation library

Publications (1)

Publication Number Publication Date
US20070026397A1 true US20070026397A1 (en) 2007-02-01

Family

ID=32913334

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/546,538 Abandoned US20070026397A1 (en) 2003-02-21 2004-02-23 Method for producing second-generation library
US13/179,283 Expired - Lifetime US9096951B2 (en) 2003-02-21 2011-07-08 Method for producing second-generation library

Family Applications After (1)

Application Number Title Priority Date Filing Date
US13/179,283 Expired - Lifetime US9096951B2 (en) 2003-02-21 2011-07-08 Method for producing second-generation library

Country Status (3)

Country Link
US (2) US20070026397A1 (en)
EP (1) EP1597395A2 (en)
WO (1) WO2004074429A2 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060121470A1 (en) * 2002-08-01 2006-06-08 Henrik Pedersen Multi-step synthesis of templated molecules
US20060127369A1 (en) * 2002-09-27 2006-06-15 Carlsberg A/S Spatially encoded polymer matrix
US20060292603A1 (en) * 2002-10-30 2006-12-28 Gouliaev Alex H Method for selecting a chemical entity from a tagged library
US20080305957A1 (en) * 2003-09-18 2008-12-11 Thomas Thisted Method for Obtaining Structural Information Concerning an Encoded Molecule and Method for Selecting Compounds
US20090143232A1 (en) * 2002-03-15 2009-06-04 Nuevolution A/S Method for synthesising templated molecules
US20090239211A1 (en) * 2004-02-17 2009-09-24 Nuevolution A/S Method For Enrichment Involving Elimination By Mismatch Hybridisation
US20090264300A1 (en) * 2005-12-01 2009-10-22 Nuevolution A/S Enzymatic encoding methods for efficient synthesis of large libraries
US20100016177A1 (en) * 2001-06-20 2010-01-21 Henrik Pedersen Templated molecules and methods for using such molecules
US9096951B2 (en) 2003-02-21 2015-08-04 Nuevolution A/S Method for producing second-generation library
US9121110B2 (en) 2002-12-19 2015-09-01 Nuevolution A/S Quasirandom structure and function guided synthesis methods
US9359601B2 (en) 2009-02-13 2016-06-07 X-Chem, Inc. Methods of creating and screening DNA-encoded libraries
US10865409B2 (en) 2011-09-07 2020-12-15 X-Chem, Inc. Methods for tagging DNA-encoded libraries
US11186836B2 (en) 2016-06-16 2021-11-30 Haystack Sciences Corporation Oligonucleotide directed and recorded combinatorial synthesis of encoded probe molecules
US11674135B2 (en) 2012-07-13 2023-06-13 X-Chem, Inc. DNA-encoded libraries having encoding oligonucleotide linkages not readable by polymerases
US11795580B2 (en) 2017-05-02 2023-10-24 Haystack Sciences Corporation Molecules for verifying oligonucleotide directed combinatorial synthesis and methods of making and using the same

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK2236606T3 (en) 2003-03-20 2014-02-10 Nuevolution As Coding of small molecules by ligation
AU2004299145B2 (en) 2003-12-17 2011-08-25 Glaxosmithkline Llc Methods for synthesis of encoded libraries
US7972994B2 (en) 2003-12-17 2011-07-05 Glaxosmithkline Llc Methods for synthesis of encoded libraries
EP1730277B1 (en) 2004-03-22 2009-10-28 Nuevolution A/S Ligational encoding using building block oligonucleotides
US7422855B2 (en) 2004-06-10 2008-09-09 Perkinelmer Las, Inc. Multiplexing assays for analyte detection
EP2365079A1 (en) * 2005-03-05 2011-09-14 Seegene, Inc. Processes using dual specificity oligonucleotide and dual specificity oligonucleotide
WO2006095941A1 (en) * 2005-03-05 2006-09-14 Seegene, Inc. Processes using dual specificity oligonucleotide and dual specificity oligonucleotide
WO2007016488A2 (en) * 2005-07-29 2007-02-08 Ensemble Discovery Corporation Analysis of encoded chemical libraries
WO2007053358A2 (en) 2005-10-28 2007-05-10 Praecis Pharmaceuticals, Inc. Methods for identifying compounds of interest using encoded libraries
CA2832672A1 (en) * 2010-04-16 2011-10-20 Nuevolution A/S Bi-functional complexes and methods for making and using such complexes
US10652656B2 (en) * 2017-03-01 2020-05-12 Mitsubishi Electric Corporation Digital signal processing device and audio device
CN107217309A (en) * 2017-07-07 2017-09-29 清华大学 Build the method and its application in the DNA sequencing library of testing gene group
US11384376B2 (en) 2018-05-31 2022-07-12 Roche Molecular Systems, Inc. Reagents and methods for post-synthetic modification of nucleic acids
AU2022292804A1 (en) * 2021-06-17 2024-01-18 Insitro, Inc. Methods of preparing bivalent molecules

Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4822731A (en) * 1986-01-09 1989-04-18 Cetus Corporation Process for labeling single-stranded nucleic acids and hybridizaiton probes
US5476930A (en) * 1993-04-12 1995-12-19 Northwestern University Non-enzymatic ligation of oligonucleotides
US5503805A (en) * 1993-11-02 1996-04-02 Affymax Technologies N.V. Apparatus and method for parallel coupling reactions
US5571903A (en) * 1993-07-09 1996-11-05 Lynx Therapeutics, Inc. Auto-ligating oligonucleotide compounds
US5573905A (en) * 1992-03-30 1996-11-12 The Scripps Research Institute Encoded combinatorial chemical libraries
US5604097A (en) * 1994-10-13 1997-02-18 Spectragen, Inc. Methods for sorting polynucleotides using oligonucleotide tags
US5639603A (en) * 1991-09-18 1997-06-17 Affymax Technologies N.V. Synthesizing and screening molecular diversity
US5681943A (en) * 1993-04-12 1997-10-28 Northwestern University Method for covalently linking adjacent oligonucleotides
US5708153A (en) * 1991-09-18 1998-01-13 Affymax Technologies N.V. Method of synthesizing diverse collections of tagged compounds
US5741643A (en) * 1993-07-02 1998-04-21 Lynx Therapeutics, Inc. Oligonucleotide clamps
US5763175A (en) * 1995-11-17 1998-06-09 Lynx Therapeutics, Inc. Simultaneous sequencing of tagged polynucleotides
US5780613A (en) * 1995-08-01 1998-07-14 Northwestern University Covalent lock for self-assembled oligonucleotide constructs
US5830658A (en) * 1995-05-31 1998-11-03 Lynx Therapeutics, Inc. Convergent synthesis of branched and multiply connected macromolecular structures
US5843650A (en) * 1995-05-01 1998-12-01 Segev; David Nucleic acid detection and amplification by chemical linkage of oligonucleotides
US5846719A (en) * 1994-10-13 1998-12-08 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US6140489A (en) * 1994-10-13 2000-10-31 Lynx Therapeutics, Inc. Compositions for sorting polynucleotides
US6143503A (en) * 1998-04-17 2000-11-07 Whitehead Institute For Biomedical Research Use of a ribozyme to join nucleic acids and peptides
US6165778A (en) * 1993-11-02 2000-12-26 Affymax Technologies N.V. Reaction vessel agitation apparatus
US6207446B1 (en) * 1997-01-21 2001-03-27 The General Hospital Corporation Selection of proteins using RNA-protein fusions
US6297053B1 (en) * 1994-02-17 2001-10-02 Maxygen, Inc. Methods for generating polynucleotides having desired characteristics by iterative selection and recombination
US6416649B1 (en) * 1997-06-26 2002-07-09 Alcoa Inc. Electrolytic production of high purity aluminum using ceramic inert anodes
US6429300B1 (en) * 1999-07-27 2002-08-06 Phylos, Inc. Peptide acceptor ligation methods
US6593088B1 (en) * 1999-08-27 2003-07-15 Japan Science And Technology Corporation Reversible photocoupling nucleic acid and phosphoroamidite
US6620587B1 (en) * 1997-05-28 2003-09-16 Discerna Limited Ribosome complexes as selection particles for in vitro display and evolution of proteins

Family Cites Families (181)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6361943B1 (en) 1996-10-17 2002-03-26 Mitsubishi Chemical Corporation Molecule that homologizes genotype and phenotype and utilization thereof
US6040166A (en) 1985-03-28 2000-03-21 Roche Molecular Systems, Inc. Kits for amplifying and detecting nucleic acid sequences, including a probe
US5047519A (en) 1986-07-02 1991-09-10 E. I. Du Pont De Nemours And Company Alkynylamino-nucleotides
US5449602A (en) 1988-01-13 1995-09-12 Amoco Corporation Template-directed photoligation
US5025388A (en) 1988-08-26 1991-06-18 Cramer Richard D Iii Comparative molecular field analysis (CoMFA)
EP0446299A4 (en) 1988-11-18 1992-05-13 The Regents Of The University Of California Method for site-specifically incorporating unnatural amino acids into proteins
US5324829A (en) 1988-12-16 1994-06-28 Ortho Diagnostic Systems, Inc. High specific activity nucleic acid probes having target recognition and signal generating moieties
DE69032483T2 (en) 1989-10-05 1998-11-26 Optein Inc CELL-FREE SYNTHESIS AND ISOLATION OF GENES AND POLYPEPTIDES
CA2039517C (en) 1990-04-03 2006-11-07 David Segev Dna probe signal amplification
US5723289A (en) 1990-06-11 1998-03-03 Nexstar Pharmaceuticals, Inc. Parallel selex
US5723286A (en) 1990-06-20 1998-03-03 Affymax Technologies N.V. Peptide library and screening systems
US5650489A (en) 1990-07-02 1997-07-22 The Arizona Board Of Regents Random bio-oligomer library, a method of synthesis thereof, and a method of use thereof
WO1992002536A1 (en) 1990-08-02 1992-02-20 The Regents Of The University Of Colorado Systematic polypeptide evolution by reverse translation
US5843701A (en) 1990-08-02 1998-12-01 Nexstar Pharmaceticals, Inc. Systematic polypeptide evolution by reverse translation
US5432272A (en) 1990-10-09 1995-07-11 Benner; Steven A. Method for incorporating into a DNA or RNA oligonucleotide using nucleotides bearing heterocyclic bases
JPH08268B2 (en) 1991-04-05 1996-01-10 アイダエンジニアリング株式会社 Blank material feeding device for press machine
AU2313392A (en) 1991-08-01 1993-03-02 University Research Corporation Systematic polypeptide evolution by reverse translation
US5474796A (en) 1991-09-04 1995-12-12 Protogene Laboratories, Inc. Method and apparatus for conducting an array of chemical reactions on a support surface
JP2780572B2 (en) 1991-09-13 1998-07-30 株式会社島津製作所 Enzymatic synthesis of oligonucleotides and use of oligonucleotides as primers
US6197556B1 (en) 1991-12-20 2001-03-06 The University Of Chicago Nucleic acid amplification using modular branched primers
US5424413A (en) 1992-01-22 1995-06-13 Gen-Probe Incorporated Branched nucleic acid probes
US5541061A (en) 1992-04-29 1996-07-30 Affymax Technologies N.V. Methods for screening factorial chemical libraries
IL107166A (en) 1992-10-01 2000-10-31 Univ Columbia Complex combinatorial chemical libraries encoded with tags
US6503759B1 (en) 1992-10-01 2003-01-07 The Trustees Of Columbia University In The City Of New York Complex combinatorial chemical libraries encoded with tags
US5565324A (en) 1992-10-01 1996-10-15 The Trustees Of Columbia University In The City Of New York Complex combinatorial chemical libraries encoded with tags
US5684169A (en) 1992-11-27 1997-11-04 Ensuiko Sugar Refining Co., Ltd. Cyclodextrin inclusion complex of taxol, and method for its production and its use
WO1994013623A1 (en) 1992-12-11 1994-06-23 Chiron Corporation Synthesis of encoded polymers
US5840485A (en) 1993-05-27 1998-11-24 Selectide Corporation Topologically segregated, encoded solid phase libraries
EP0705279B1 (en) 1993-05-27 2003-02-19 Selectide Corporation Topologically segregated, encoded solid phase libraries
AU7323194A (en) 1993-07-02 1995-01-24 Lynx Therapeutics, Inc. Synthesis of branched nucleic acids
US6087186A (en) 1993-07-16 2000-07-11 Irori Methods and apparatus for synthesizing labeled combinatorial chemistry libraries
GB9315847D0 (en) 1993-07-30 1993-09-15 Isis Innovation Tag reagent and assay method
DK96093D0 (en) 1993-08-25 1993-08-25 Symbicom Ab IMPROVEMENTS IN MOLECULAR MODELING AND DRUG DESIGN
CN1525171A (en) 1993-10-01 2004-09-01 The Trustees Of Columbia University In The City Of New York Complex combinatorial chemical libraries encoded with tags
GB2298863B (en) 1993-11-02 1998-03-11 Affymax Tech Nv Apparatus and process for the synthesis of diverse compounds especially for generating and screening compound libraries
EP0736103A4 (en) 1993-12-17 1999-07-28 Roger S Cubicciotti Nucleotide-directed assembly of bimolecular and multimolecular drugs and devices
US7067326B2 (en) 1994-01-13 2006-06-27 The Trustees Of Columbia University In The City Of New York Synthetic receptors, libraries and uses thereof
US5449613A (en) 1994-03-01 1995-09-12 The University Of Iowa Research Foundation Reacting an enzyme in a non-aqueous solvent by adding a lyophilizate of enzyme and salt to the solvent
US6936477B2 (en) 1994-04-13 2005-08-30 The Trustees Of Columbia University In The City Of New York Complex combinatorial chemical libraries encoded with tags
US5643722A (en) 1994-05-11 1997-07-01 Trustees Of Boston University Methods for the detection and isolation of proteins
US5663046A (en) 1994-06-22 1997-09-02 Pharmacopeia, Inc. Synthesis of combinatorial libraries
EP0776330B1 (en) 1994-06-23 2003-08-20 Affymax Technologies N.V. Photolabile compounds and methods for their use
WO1996003418A1 (en) 1994-07-26 1996-02-08 The Scripps Research Institute Soluble combinatorial libraries
US5463564A (en) * 1994-09-16 1995-10-31 3-Dimensional Pharmaceuticals, Inc. System and method of automatically generating chemical compounds with desired properties
US5985356A (en) 1994-10-18 1999-11-16 The Regents Of The University Of California Combinatorial synthesis of novel materials
US6045671A (en) 1994-10-18 2000-04-04 Symyx Technologies, Inc. Systems and methods for the combinatorial synthesis of novel materials
WO1996024847A1 (en) 1995-02-10 1996-08-15 Smithkline Beecham Corporation A process for identifying pharmaceutically active agents using an epitope-tagged library
US5679519A (en) 1995-05-09 1997-10-21 Oprandy; John J. Multi-label complex for enhanced sensitivity in electrochemiluminescence assay
US6210900B1 (en) 1995-05-23 2001-04-03 Smithkline Beecham Corporation Method of encoding a series of combinatorial libraries and developing structure activity relationships
US5824471A (en) 1995-06-05 1998-10-20 Brigham And Women's Hospital Detection of mismatches by cleavage of nucleic acid heteroduplexes
US5958792A (en) 1995-06-07 1999-09-28 Chiron Corporation Combinatorial libraries of substrate-bound cyclic organic compounds
HUP9900910A2 (en) 1995-06-07 1999-07-28 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
WO1997004131A1 (en) 1995-07-21 1997-02-06 Forsyth Dental Infirmary For Children Single primer amplification of polynucleotide hairpins
US5795976A (en) 1995-08-08 1998-08-18 The Board Of Trustees Of The Leland Stanford Junior University Detection of nucleic acid heteroduplex molecules by denaturing high-performance liquid chromatography and methods for comparative sequencing
US5723320A (en) 1995-08-29 1998-03-03 Dehlinger; Peter J. Position-addressable polynucleotide arrays
WO1997011958A1 (en) 1995-09-29 1997-04-03 The Scripps Research Institute Protein signature analysis
DE19646372C1 (en) 1995-11-11 1997-06-19 Evotec Biosystems Gmbh Conjugates of polypeptide and encoding nucleic acid
WO1997019039A1 (en) 1995-11-17 1997-05-29 Novartis Ag Solid phase synthesis of heterocyclic compounds and combinatorial compound library
US5763263A (en) 1995-11-27 1998-06-09 Dehlinger; Peter J. Method and apparatus for producing position addressable combinatorial libraries
US6537776B1 (en) 1999-06-14 2003-03-25 Diversa Corporation Synthetic ligation reassembly in directed evolution
US6613508B1 (en) 1996-01-23 2003-09-02 Qiagen Genomics, Inc. Methods and compositions for analyzing nucleic acid molecules utilizing sizing techniques
JP2002515738A (en) 1996-01-23 2002-05-28 アフィメトリックス,インコーポレイティド Nucleic acid analysis
US5880972A (en) 1996-02-26 1999-03-09 Pharmacopeia, Inc. Method and apparatus for generating and representing combinatorial chemistry libraries
JP2001519763A (en) 1996-03-22 2001-10-23 オントジエン・コーポレイシヨン A spatially distributed position coded combination library synthesis method
US6294325B1 (en) 1996-07-05 2001-09-25 The Mount Sinai School Of Medicine Of The City University Of New York Cloning and expression of thermostable multi genes and proteins and uses thereof
US5821356A (en) 1996-08-12 1998-10-13 The Perkin Elmer Corporation Propargylethoxyamino nucleotides
US6355490B1 (en) 1996-09-13 2002-03-12 Jill Edie Hochlowski Attached tags for use in combinatorial chemistry synthesis
DE19642751A1 (en) 1996-10-16 1998-04-23 Deutsches Krebsforsch Saccharide library
US5954874A (en) 1996-10-17 1999-09-21 Hunter; Charles Eric Growth of bulk single crystals of aluminum nitride from a melt
US6261804B1 (en) 1997-01-21 2001-07-17 The General Hospital Corporation Selection of proteins using RNA-protein fusions
US6969584B2 (en) 1997-06-12 2005-11-29 Rigel Pharmaceuticals, Inc. Combinatorial enzymatic complexes
WO1998058256A1 (en) 1997-06-16 1998-12-23 The University Of North Carolina At Chapel Hill PEPTIDO OLIGONUCLEOTIDES (PONs) AND THEIR COMBINATORIAL LIBRARIES
US6607878B2 (en) 1997-10-06 2003-08-19 Stratagene Collections of uniquely tagged molecules
US6348322B1 (en) 1997-10-17 2002-02-19 Duke University Method of screening for specific binding interactions
US20030004122A1 (en) 1997-11-05 2003-01-02 Leonid Beigelman Nucleotide triphosphates and their incorporation into oligonucleotides
US6232066B1 (en) 1997-12-19 2001-05-15 Neogen, Inc. High throughput assay system
AU2871299A (en) 1998-02-21 1999-09-06 Alan W. Schwabacher One dimensional chemical compound arrays and methods for assaying them
US6316616B1 (en) 1998-04-02 2001-11-13 President And Fellows Of Harvard College Parallel combinatorial approach to the discovery and optimization of catalysts and uses thereof
WO1999051773A1 (en) 1998-04-03 1999-10-14 Phylos, Inc. Addressable protein arrays
ATE256142T1 (en) 1998-05-15 2003-12-15 Isis Innovation LIBRARIES OF DIFFERENTLY MARKED OLIGOMERS
US6287765B1 (en) 1998-05-20 2001-09-11 Molecular Machines, Inc. Methods for detecting and identifying single molecules
US5948648A (en) 1998-05-29 1999-09-07 Khan; Shaheer H. Nucleotide compounds including a rigid linker
US6096875A (en) 1998-05-29 2000-08-01 The Perkin-Elmer Corporation Nucleotide compounds including a rigid linker
JP2002517474A (en) 1998-06-10 2002-06-18 グリコデザイン インコーポレイテッド Directed combinatorial compound libraries and high-throughput assays for screening them
AU6502599A (en) 1998-10-05 2000-04-26 Lynx Therapeutics, Inc. Enzymatic synthesis of oligonucleotide tags
WO2000021909A2 (en) 1998-10-09 2000-04-20 Pharmacopeia, Inc. Selecting codes to be used for encoding combinatorial libraries
US6175001B1 (en) 1998-10-16 2001-01-16 The Scripps Research Institute Functionalized pyrimidine nucleosides and nucleotides and DNA's incorporating same
CA2346989A1 (en) 1998-10-19 2000-04-27 The Board Of Trustees Of The Leland Stanford Junior University Dna-templated combinatorial library chemistry
EP1124948A1 (en) 1998-10-28 2001-08-22 Novozymes A/S Method for generating a gene library
US5942609A (en) 1998-11-12 1999-08-24 The Perkin-Elmer Corporation Ligation assembly and detection of polynucleotides on solid-support
ES2280131T3 (en) 1998-12-02 2007-09-01 Adnexus Therapeutics, Inc. DNA-PROTEIN FUSIONS AND USES OF THE SAME.
AU2225000A (en) 1999-01-08 2000-07-24 Ceres, Inc. Sequence-determined dna fragments and corresponding polypeptides encoded thereby
CA2403209A1 (en) 1999-04-08 2000-10-19 Pavel V. Sergeev Synthesis of biologically active compounds in cells
EP2360270B1 (en) 1999-05-20 2016-11-09 Illumina, Inc. Combinatorial decoding of random nucleic acid arrays
AU784040B2 (en) 1999-06-25 2006-01-19 Nanosphere, Inc. Nanoparticles having oligonucleotides attached thereto and uses therefor
GB9920194D0 (en) 1999-08-27 1999-10-27 Advanced Biotech Ltd A heat-stable thermostable DNA polymerase for use in nucleic acid amplification
US20020048760A1 (en) 1999-12-10 2002-04-25 Hyseq, Inc. Use of mismatch cleavage to detect complementary probes
ATE322558T1 (en) 2000-01-24 2006-04-15 Compound Therapeutics Inc SENSITIVE AND MULTIPLEX DIAGNOSTIC TESTS FOR PROTEIN ANALYSIS
EP1252126A4 (en) 2000-02-03 2006-07-05 Nanoscale Combinatorial Synthe Nonredundant split/pool synthesis of combinatorial libraries
US20020127598A1 (en) 2000-03-27 2002-09-12 Wenqiang Zhou Solution-phase combinatorial library synthesis and pharmaceutically active compounds produced thereby
US7682837B2 (en) 2000-05-05 2010-03-23 Board Of Trustees Of Leland Stanford Junior University Devices and methods to form a randomly ordered array of magnetic beads and uses thereof
AU6177101A (en) 2000-05-19 2001-12-03 Richard B Williams In vitro evolution of nucleic acids and encoded polypeptide
US7244560B2 (en) 2000-05-21 2007-07-17 Invitrogen Corporation Methods and compositions for synthesis of nucleic acid molecules using multiple recognition sites
ES2281424T3 (en) 2000-06-05 2007-10-01 Novartis Vaccines And Diagnostics, Inc. MICROMATRICES TO CARRY OUT PROTEOMIC ANALYSIS.
US20020115068A1 (en) 2000-06-23 2002-08-22 Ian Tomlinson Matrix screening method
JP2004502705A (en) 2000-07-03 2004-01-29 メルク エンド カムパニー インコーポレーテッド Coding methods in combinatorial libraries
US6811977B2 (en) 2000-07-27 2004-11-02 California Institute Of Technology Rapid, quantitative method for the mass spectrometric analysis of nucleic acids for gene expression and genotyping
US20020072887A1 (en) 2000-08-18 2002-06-13 Sandor Szalma Interaction fingerprint annotations from protein structure models
WO2004007529A2 (en) 2002-07-15 2004-01-22 The Trustees Of Princeton University Iap binding compounds
US20020146723A1 (en) 2000-10-25 2002-10-10 Krontiris Theodore G. Candidate region mismatch scanning for genotyping and mutation detection
JP2002315577A (en) 2000-11-14 2002-10-29 Gencom Co Method for constructing nucleic acid library
EP2332896A3 (en) * 2001-03-19 2012-09-26 President and Fellows of Harvard College Evolving new molecular function
AU2002257076A1 (en) 2001-03-19 2002-10-03 President And Fellows Of Harvard College Nucleic acid shuffling
WO2002083951A1 (en) 2001-04-10 2002-10-24 Northeastern University Multiplexed ligand/protein binding assays with pna labels
US7838270B2 (en) 2001-05-22 2010-11-23 The University Of Chicago Target-dependent transcription using deletion mutants of N4 RNA polymerase
AU2002312389A1 (en) 2001-06-05 2002-12-16 Irm Llc Functional proteomic profiling
CN1539014A (en) * 2001-06-20 2004-10-20 Nuevolution A/S Templated molecules and methods for using such molecules
US7727713B2 (en) 2001-06-20 2010-06-01 Nuevolution A/S Templated molecules and methods for using such molecules
US20060234231A1 (en) 2001-06-20 2006-10-19 Nuevolution A/S Microarrays displaying encoded molecules
US20040161741A1 (en) 2001-06-30 2004-08-19 Elazar Rabani Novel compositions and processes for analyte detection, quantification and amplification
DE10145226A1 (en) 2001-09-13 2003-04-10 Lifebits Ag Manufacture of carrier-bound molecules
CA2466164A1 (en) 2001-10-30 2003-05-08 Nanomics Biosystems Pty, Ltd. Device and methods for directed synthesis of chemical libraries
WO2003062417A1 (en) 2002-01-22 2003-07-31 Mitsubishi Chemical Corporation Rna-dna ligation product and utilization thereof
ATE424561T1 (en) 2002-03-08 2009-03-15 Eidgenoess Tech Hochschule CODED, SELF-ASSEMBLING CHEMICAL LIBRARIES (ESACHEL)
AU2003253069A1 (en) 2002-03-15 2003-09-29 Nuevolution A/S A building block forming a c-c bond upon reaction
EP1487850A2 (en) 2002-03-15 2004-12-22 Nuevolution A/S A building block forming a c-c or a c-hetero atom bond upon reaction
EP1487848A2 (en) 2002-03-15 2004-12-22 Nuevolution A/S A building block forming a c=c double bond upon reaction
WO2003078626A2 (en) 2002-03-15 2003-09-25 Nuevolution A/S A building block capable of transferring a functional entity
NZ535144A (en) 2002-03-15 2006-03-31 Nuevolution As An improved method for synthesising templated molecules
AU2003226008A1 (en) 2002-03-22 2003-10-13 Emory University Template-driven processes for synthesizing polymers and components related to such processes
GB0213816D0 (en) 2002-06-14 2002-07-24 Univ Aston Method of producing DNA and protein libraries
AU2003240436A1 (en) 2002-06-20 2004-01-06 Nuevolution A/S Microarrays displaying encoded molecules
EP1527173A1 (en) 2002-07-23 2005-05-04 Nuevolution A/S Gene shuffling by template switching
EP1539980B1 (en) 2002-08-01 2016-02-17 Nuevolution A/S Library of complexes comprising small non-peptide molecules and double-stranded oligonucleotides identifying the molecules
AU2003263937B2 (en) 2002-08-19 2010-04-01 The President And Fellows Of Harvard College Evolving new molecular function
US20040197845A1 (en) 2002-08-30 2004-10-07 Arjang Hassibi Methods and apparatus for pathogen detection, identification and/or quantification
EP1539953A2 (en) 2002-09-12 2005-06-15 Nuevolution A/S Proximity-aided synthesis of templated molecules
JP2006500959A (en) 2002-09-30 2006-01-12 パラレル バイオサイエンス, インコーポレイテッド Polynucleotide synthesis and labeling by dynamic sampling binding
DK3299463T3 (en) 2002-10-30 2020-12-07 Nuevolution As ENZYMATIC CODING
AU2003291677A1 (en) 2002-10-30 2004-05-25 Pointilliste, Inc. Methods for producing polypeptide-tagged collections and capture systems containing the tagged polypeptides
US9121110B2 (en) 2002-12-19 2015-09-01 Nuevolution A/S Quasirandom structure and function guided synthesis methods
WO2005003375A2 (en) 2003-01-29 2005-01-13 454 Corporation Methods of amplifying and sequencing nucleic acids
US20060269920A1 (en) 2003-02-21 2006-11-30 Nuevolution A/S Method for obtaining structural information about an encoded molecule
EP1597395A2 (en) 2003-02-21 2005-11-23 Nuevolution A/S Method for producing second-generation library
JP4054871B2 (en) 2003-02-24 2008-03-05 独立行政法人産業技術総合研究所 Thermostable DNA ligase
DK2236606T3 (en) 2003-03-20 2014-02-10 Nuevolution As Coding of small molecules by ligation
WO2004099441A2 (en) 2003-05-09 2004-11-18 Hyscite Discovery As Selection and evolution of chemical libraries
WO2004110964A2 (en) 2003-06-16 2004-12-23 Nuevolution A/S Encoded molecules by translation (emt)
WO2005003778A2 (en) 2003-07-02 2005-01-13 Nuevolution A/S A method for identifying a synthetic molecule having affinity towards a target
US20070134662A1 (en) 2003-07-03 2007-06-14 Juswinder Singh Structural interaction fingerprint
EP1533385A1 (en) 2003-09-05 2005-05-25 Nuevolution A/S Templated compounds for generation of encoded molecules and directional methods using such compounds
ATE447626T1 (en) 2003-09-18 2009-11-15 Nuevolution As METHOD FOR OBTAINING STRUCTURAL INFORMATION FROM ENCODED MOLECULES AND FOR SELECTING COMPOUNDS
US7972994B2 (en) 2003-12-17 2011-07-05 Glaxosmithkline Llc Methods for synthesis of encoded libraries
AU2004299145B2 (en) 2003-12-17 2011-08-25 Glaxosmithkline Llc Methods for synthesis of encoded libraries
US20090239211A1 (en) 2004-02-17 2009-09-24 Nuevolution A/S Method For Enrichment Involving Elimination By Mismatch Hybridisation
EP1730277B1 (en) 2004-03-22 2009-10-28 Nuevolution A/S Ligational encoding using building block oligonucleotides
WO2005116213A2 (en) 2004-04-15 2005-12-08 President And Fellows Of Harvard College Directed evolution of proteins
CN101056980B (en) 2004-11-08 2012-05-23 威泊根私人有限公司 Structural nucleid acid guided chemical synthesis
ATE420170T1 (en) 2004-11-22 2009-01-15 Peter Birk Rasmussen TEMPLATE-DIRECTED SPLIT-AND-MIX SYNTHESIS OF SMALL MOLECULE LIBRARIES
ATE477254T1 (en) 2004-12-20 2010-08-15 Genentech Inc PYRROLIDINES AS INHIBITORS OF IAP
JP4969459B2 (en) 2005-01-21 2012-07-04 プレジデント アンド フェロウズ オブ ハーバード カレッジ Free reactants used in nucleic acid template synthesis
US7968289B2 (en) 2005-05-03 2011-06-28 Ensemble Therapeutics Corporation Turn over probes and use thereof for nucleic acid detection
BRPI0611474A2 (en) 2005-05-26 2010-09-14 Ensemble Discovery Corp nucleic acid modeled chemical biodetection
WO2006130669A2 (en) 2005-05-31 2006-12-07 Ensemble Discovery Corporation Anchor-assisted fragment selection and directed assembly
WO2006135654A2 (en) 2005-06-07 2006-12-21 President And Fellows Of Harvard College Polymer evolution via templated synthesis related applications
ATE461279T1 (en) 2005-06-07 2010-04-15 Harvard College ORDERED MULTI-STEP SYNTHESIS USING NUCLEIC ACID-MEDIATED CHEMISTRY
EP2338990A3 (en) 2005-06-09 2011-10-19 Praecis Pharmaceuticals Inc. Methods for synthesis of encoded libraries
WO2006138666A2 (en) 2005-06-17 2006-12-28 President And Fellows Of Harvard College Iterated branching reaction pathways via nucleic acid-mediated chemistry
US20090035824A1 (en) 2005-06-17 2009-02-05 Liu David R Nucleic acid-templated chemistry in organic solvents
WO2007011722A2 (en) 2005-07-15 2007-01-25 President And Fellows Of Harvard College Reaction discovery system
WO2007016488A2 (en) 2005-07-29 2007-02-08 Ensemble Discovery Corporation Analysis of encoded chemical libraries
WO2007053358A2 (en) 2005-10-28 2007-05-10 Praecis Pharmaceuticals, Inc. Methods for identifying compounds of interest using encoded libraries
ATE490318T1 (en) 2005-12-01 2010-12-15 Nuevolution As ENZYME-MEDIATING CODING METHODS FOR EFFICIENT SYNTHESIS OF LARGE LIBRARIES
WO2007124758A1 (en) 2006-05-03 2007-11-08 Vipergen Aps A method for preparing compounds by nucleic acid directed synthesis
US20100143499A1 (en) 2006-07-24 2010-06-10 Tetralogic Pharmaceuticals Corporation Dimeric iap inhibitors
DK2064348T3 (en) 2006-09-18 2012-05-29 Ensemble Therapeutics Corp Receptor family profiling
CA2664649A1 (en) 2006-09-28 2008-05-08 Ensemble Discovery Corporation Compositions and methods for biodetection by nucleic acid-templated chemistry
WO2009018003A2 (en) 2007-07-27 2009-02-05 Ensemble Discovery Corporation Detection assays and use thereof
WO2009077173A2 (en) 2007-12-19 2009-06-25 Philochem Ag Dna-encoded chemical libraries
TW201011006A (en) 2008-06-16 2010-03-16 Nuevolution As IAP binding compounds
CA2832672A1 (en) 2010-04-16 2011-10-20 Nuevolution A/S Bi-functional complexes and methods for making and using such complexes

Patent Citations (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4822731A (en) * 1986-01-09 1989-04-18 Cetus Corporation Process for labeling single-stranded nucleic acids and hybridization probes
US5639603A (en) * 1991-09-18 1997-06-17 Affymax Technologies N.V. Synthesizing and screening molecular diversity
US5708153A (en) * 1991-09-18 1998-01-13 Affymax Technologies N.V. Method of synthesizing diverse collections of tagged compounds
US6165717A (en) * 1991-09-18 2000-12-26 Affymax Technologies N.V. Method of synthesizing diverse collections of oligomers
US5770358A (en) * 1991-09-18 1998-06-23 Affymax Technologies N.V. Tagged synthetic oligomer libraries
US6140493A (en) * 1991-09-18 2000-10-31 Affymax Technologies N.V. Method of synthesizing diverse collections of oligomers
US6060596A (en) * 1992-03-30 2000-05-09 The Scripps Research Institute Encoded combinatorial chemical libraries
US5573905A (en) * 1992-03-30 1996-11-12 The Scripps Research Institute Encoded combinatorial chemical libraries
US5723598A (en) * 1992-03-30 1998-03-03 The Scripps Research Institute Encoded combinatorial chemical libraries
US5681943A (en) * 1993-04-12 1997-10-28 Northwestern University Method for covalently linking adjacent oligonucleotides
US5476930A (en) * 1993-04-12 1995-12-19 Northwestern University Non-enzymatic ligation of oligonucleotides
US5741643A (en) * 1993-07-02 1998-04-21 Lynx Therapeutics, Inc. Oligonucleotide clamps
US5571903A (en) * 1993-07-09 1996-11-05 Lynx Therapeutics, Inc. Auto-ligating oligonucleotide compounds
US5665975A (en) * 1993-11-02 1997-09-09 Affymax Technologies N.V. Optical detector including an optical alignment block and method
US6056926A (en) * 1993-11-02 2000-05-02 Affymax Technologies N.V. Apparatus and method for parallel coupling reactions
US5503805A (en) * 1993-11-02 1996-04-02 Affymax Technologies N.V. Apparatus and method for parallel coupling reactions
US6165778A (en) * 1993-11-02 2000-12-26 Affymax Technologies N.V. Reaction vessel agitation apparatus
US6297053B1 (en) * 1994-02-17 2001-10-02 Maxygen, Inc. Methods for generating polynucleotides having desired characteristics by iterative selection and recombination
US6140489A (en) * 1994-10-13 2000-10-31 Lynx Therapeutics, Inc. Compositions for sorting polynucleotides
US5846719A (en) * 1994-10-13 1998-12-08 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US5635400A (en) * 1994-10-13 1997-06-03 Spectragen, Inc. Minimally cross-hybridizing sets of oligonucleotide tags
US5604097A (en) * 1994-10-13 1997-02-18 Spectragen, Inc. Methods for sorting polynucleotides using oligonucleotide tags
US6352828B1 (en) * 1994-10-13 2002-03-05 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US6150516A (en) * 1994-10-13 2000-11-21 Lynx Therapeutics, Inc. Kits for sorting and identifying polynucleotides
US5654413A (en) * 1994-10-13 1997-08-05 Spectragen, Inc. Compositions for sorting polynucleotides
US6172214B1 (en) * 1994-10-13 2001-01-09 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US6235475B1 (en) * 1994-10-13 2001-05-22 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US5843650A (en) * 1995-05-01 1998-12-01 Segev; David Nucleic acid detection and amplification by chemical linkage of oligonucleotides
US5830658A (en) * 1995-05-31 1998-11-03 Lynx Therapeutics, Inc. Convergent synthesis of branched and multiply connected macromolecular structures
US5780613A (en) * 1995-08-01 1998-07-14 Northwestern University Covalent lock for self-assembled oligonucleotide constructs
US5763175A (en) * 1995-11-17 1998-06-09 Lynx Therapeutics, Inc. Simultaneous sequencing of tagged polynucleotides
US6207446B1 (en) * 1997-01-21 2001-03-27 The General Hospital Corporation Selection of proteins using RNA-protein fusions
US6620587B1 (en) * 1997-05-28 2003-09-16 Discerna Limited Ribosome complexes as selection particles for in vitro display and evolution of proteins
US6416649B1 (en) * 1997-06-26 2002-07-09 Alcoa Inc. Electrolytic production of high purity aluminum using ceramic inert anodes
US6143503A (en) * 1998-04-17 2000-11-07 Whitehead Institute For Biomedical Research Use of a ribozyme to join nucleic acids and peptides
US6429300B1 (en) * 1999-07-27 2002-08-06 Phylos, Inc. Peptide acceptor ligation methods
US6593088B1 (en) * 1999-08-27 2003-07-15 Japan Science And Technology Corporation Reversible photocoupling nucleic acid and phosphoroamidite

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100016177A1 (en) * 2001-06-20 2010-01-21 Henrik Pedersen Templated molecules and methods for using such molecules
US10669538B2 (en) 2001-06-20 2020-06-02 Nuevolution A/S Templated molecules and methods for using such molecules
US8932992B2 (en) 2001-06-20 2015-01-13 Nuevolution A/S Templated molecules and methods for using such molecules
US10731151B2 (en) 2002-03-15 2020-08-04 Nuevolution A/S Method for synthesising templated molecules
US20090143232A1 (en) * 2002-03-15 2009-06-04 Nuevolution A/S Method for synthesising templated molecules
US8808984B2 (en) 2002-03-15 2014-08-19 Nuevolution A/S Method for synthesising templated molecules
US10730906B2 (en) 2002-08-01 2020-08-04 Nuevolution A/S Multi-step synthesis of templated molecules
US20060121470A1 (en) * 2002-08-01 2006-06-08 Henrik Pedersen Multi-step synthesis of templated molecules
US8791053B2 (en) 2002-09-27 2014-07-29 Mpm-Holding Aps Spatially encoded polymer matrix
US20060127369A1 (en) * 2002-09-27 2006-06-15 Carlsberg A/S Spatially encoded polymer matrix
US10077440B2 (en) 2002-10-30 2018-09-18 Nuevolution A/S Method for the synthesis of a bifunctional complex
US8206901B2 (en) 2002-10-30 2012-06-26 Nuevolution A/S Method for the synthesis of a bifunctional complex
US8722583B2 (en) 2002-10-30 2014-05-13 Nuevolution A/S Method for selecting a chemical entity from a tagged library
US11001835B2 (en) 2002-10-30 2021-05-11 Nuevolution A/S Method for the synthesis of a bifunctional complex
US9109248B2 (en) 2002-10-30 2015-08-18 Nuevolution A/S Method for the synthesis of a bifunctional complex
US9284600B2 (en) 2002-10-30 2016-03-15 Nuevolution A/S Method for the synthesis of a bifunctional complex
US20060292603A1 (en) * 2002-10-30 2006-12-28 Gouliaev Alex H Method for selecting a chemical entity from a tagged library
US9121110B2 (en) 2002-12-19 2015-09-01 Nuevolution A/S Quasirandom structure and function guided synthesis methods
US9096951B2 (en) 2003-02-21 2015-08-04 Nuevolution A/S Method for producing second-generation library
US20080305957A1 (en) * 2003-09-18 2008-12-11 Thomas Thisted Method for Obtaining Structural Information Concerning an Encoded Molecule and Method for Selecting Compounds
US11118215B2 (en) 2003-09-18 2021-09-14 Nuevolution A/S Method for obtaining structural information concerning an encoded molecule and method for selecting compounds
US20090239211A1 (en) * 2004-02-17 2009-09-24 Nuevolution A/S Method For Enrichment Involving Elimination By Mismatch Hybridisation
US9574189B2 (en) 2005-12-01 2017-02-21 Nuevolution A/S Enzymatic encoding methods for efficient synthesis of large libraries
US20090264300A1 (en) * 2005-12-01 2009-10-22 Nuevolution A/S Enzymatic encoding methods for efficient synthesis of large libraries
US11702652B2 (en) 2005-12-01 2023-07-18 Nuevolution A/S Enzymatic encoding methods for efficient synthesis of large libraries
US9359601B2 (en) 2009-02-13 2016-06-07 X-Chem, Inc. Methods of creating and screening DNA-encoded libraries
US11168321B2 (en) 2009-02-13 2021-11-09 X-Chem, Inc. Methods of creating and screening DNA-encoded libraries
US10865409B2 (en) 2011-09-07 2020-12-15 X-Chem, Inc. Methods for tagging DNA-encoded libraries
US11674135B2 (en) 2012-07-13 2023-06-13 X-Chem, Inc. DNA-encoded libraries having encoding oligonucleotide linkages not readable by polymerases
US11186836B2 (en) 2016-06-16 2021-11-30 Haystack Sciences Corporation Oligonucleotide directed and recorded combinatorial synthesis of encoded probe molecules
US11795580B2 (en) 2017-05-02 2023-10-24 Haystack Sciences Corporation Molecules for verifying oligonucleotide directed combinatorial synthesis and methods of making and using the same

Also Published As

Publication number Publication date
WO2004074429A2 (en) 2004-09-02
US9096951B2 (en) 2015-08-04
WO2004074429A3 (en) 2004-09-30
EP1597395A2 (en) 2005-11-23
US20120028812A1 (en) 2012-02-02

Similar Documents

Publication Publication Date Title
US9096951B2 (en) Method for producing second-generation library
US20220205027A1 (en) Method for obtaining structural information concerning an encoded molecule and method for selecting compounds
US9885035B2 (en) Method for the synthesis of a bifunctional complex
EP1723255B1 (en) Method for enrichment involving elimination by mismatch hybridisation
EP3018206B1 (en) Enzymatic encoding methods for efficient synthesis of large libraries
US10202637B2 (en) Methods for analyzing nucleic acid
US9957549B2 (en) Compositions and methods for negative selection of non-desired nucleic acid sequences
CN107250447A (en) A kind of DNA long fragment library constructing method
AU2015243130B2 (en) Systems and methods for clonal replication and amplification of nucleic acid molecules for genomic and therapeutic applications
WO2012003374A2 (en) Targeted sequencing library preparation by genomic dna circularization
US20060269920A1 (en) Method for obtaining structural information about an encoded molecule
US20210261944A1 (en) Compositions and methods for ordered and continuous complementary DNA (cDNA) synthesis across non-continuous templates

Legal Events

Date Code Title Description
AS Assignment

Owner name: NUEVOLUTION A/S, DENMARK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: FRESKGARD, PER-OLA; GOULIAEV, ALEX HAAHR; THISTED, THOMAS; AND OTHERS; REEL/FRAME: 023429/0783; SIGNING DATES FROM 20060303 TO 20060401

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION