Dianeva and stormcrow, it looks like you're talking past each other. There are two different concepts.
The probability of X, for instance X = "throwing two coins and getting a head and a tail", basically just means: look at all possible events (the 'event space'), for instance "heads heads, heads tails, tails heads, tails tails", and take the ratio of the number of those events which are correctly described by X, here "heads tails, tails heads", to the total number of possible events (regardless of X or not X). In this example that gives 2/4 = 1/2. (This counting works because every one of those events is equally likely, as with fair coins.)
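If it helps to see the counting done explicitly, here is a minimal Python sketch of the two-coin example. The names `outcomes`, `prob` and `one_head_one_tail` are just made up for illustration, and it assumes fair coins so that all four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Event space for two fair coin tosses: HH, HT, TH, TT (all equally likely).
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Ratio of outcomes described by `event` to all possible outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

# X = "getting a head and a tail", in either order.
def one_head_one_tail(o):
    return sorted(o) == ["H", "T"]

p_x = prob(one_head_one_tail)
print(p_x)  # 1/2
```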
P(A & B) means the probability that both A and B occur. This will always be less than or equal to P(A) [and likewise P(B)], because otherwise you'd be saying that more events could be described by 'A&B' than by just 'A'; that can't be true, because for A&B to be true, A must be true. That is to say, A&B is a subset of A, and so it can't be bigger than A. stormcrow's example about feminist bankers is a good one for this.
The source of confusion for these kinds of things is usually that people have some vague and rather incorrect notions of what 'probability' means. In reality, it's just set theory. Stop thinking about uncertainty. We're just measuring sizes of sets. Therefore it's very useful to consider a Venn diagram, shown below. U represents all possible outcomes. A represents all outcomes which could be described by "A"; B likewise, and where these two overlap is "A&B". It's now easy to see that P(A&B), the ratio of the size of A&B to the size of U, will always be smaller than or equal to P(A), the ratio of the size of A to the size of U, because A&B sits entirely inside A and so can't be bigger than it.
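To make the "sizes of sets" picture concrete, here is a rough sketch using Python sets. The die example and the names `U`, `A`, `B`, `p` are my own assumptions for illustration, not anything from the thread; it assumes one roll of a fair six-sided die.

```python
from fractions import Fraction

U = set(range(1, 7))                # all possible outcomes of one fair die roll
A = {n for n in U if n % 2 == 0}    # "the roll is even"        -> {2, 4, 6}
B = {n for n in U if n > 3}         # "the roll is more than 3" -> {4, 5, 6}

def p(s):
    """Probability of a set of outcomes = its size relative to U."""
    return Fraction(len(s), len(U))

A_and_B = A & B                     # the overlap in the Venn diagram -> {4, 6}

# The overlap is a subset of each circle, so it can never be the bigger set.
print(A_and_B <= A, A_and_B <= B)   # True True
print(p(A), p(B), p(A_and_B))       # 1/2 1/2 1/3
```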
Now the second concept I mentioned. This is P(A|B), "the probability of A given B". This means that we take it as given that B has already occurred, and then ask what the probability of A would be in light of this. If again you look below, you should actually be able to derive a formula for this. We're taking B as true, so B, rather than U, is our set of possible events. We now want all instances described by A, in B. This is just the set A&B. So the probability of A, given B, i.e. the ratio of the number of possible events where A occurred to the total number of possible events, given B, is the size of A&B divided by the size of B. Divide top and bottom by the size of U and you get P(A|B) = P(A&B)/P(B) (which only makes sense if P(B) isn't zero).
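Here is a small sketch of that derivation in Python, checking that "count inside B" and "P(A&B)/P(B)" agree. The fair-die example and the names `U`, `A`, `B`, `p` are assumptions I've picked for illustration.

```python
from fractions import Fraction

U = set(range(1, 7))                # one roll of a fair six-sided die
A = {n for n in U if n % 2 == 0}    # "the roll is even"
B = {n for n in U if n > 3}         # "the roll is more than 3"

def p(s):
    """Size of a set of outcomes relative to U."""
    return Fraction(len(s), len(U))

# Route 1: treat B as the new universe and count the A outcomes inside it.
by_counting = Fraction(len(A & B), len(B))

# Route 2: the formula P(A|B) = P(A&B) / P(B), valid because P(B) > 0 here.
by_formula = p(A & B) / p(B)

print(by_counting, by_formula)      # 2/3 2/3
```

Note that P(A) on its own is 1/2 here, so knowing B has happened pushes the probability of A up to 2/3.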
And if you play around with circles, you should find that you can indeed make P(A|B) take pretty much any value between 0 and 1, regardless of the size of P(A) (so long as A is neither impossible nor certain). Sometimes B makes A more likely (if A is positively linked with B somehow, for instance B = "it's raining", A = "you are sad"), sometimes B makes A less likely, and sometimes A is independent of B, i.e. the size of A in B relative to B is the same as the size of A relative to U (for instance, tossing two coins, and A = "the first is heads", B = "the second is heads").
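A quick sketch of all three cases with two fair coins, in Python. A = "the first is heads" and B = "the second is heads" come from the example above; the other two choices of B (`B_up`, `B_down`) are ones I've made up to show the probability moving in each direction.

```python
from fractions import Fraction
from itertools import product

U = set(product("HT", repeat=2))    # HH, HT, TH, TT, all equally likely

def p(s):
    return Fraction(len(s), len(U))

def p_given(a, b):
    """P(a|b) = P(a&b) / P(b); assumes b is not empty."""
    return p(a & b) / p(b)

A = {o for o in U if o[0] == "H"}        # "the first is heads"
B_up = {o for o in U if "H" in o}        # "at least one head"   (hypothetical)
B_down = {o for o in U if "T" in o}      # "at least one tail"   (hypothetical)
B_indep = {o for o in U if o[1] == "H"}  # "the second is heads"

print(p(A))                  # 1/2
print(p_given(A, B_up))      # 2/3  -> B makes A more likely
print(p_given(A, B_down))    # 1/3  -> B makes A less likely
print(p_given(A, B_indep))   # 1/2  -> same as P(A): independent
```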

