
Learn American English Reductions and Rhythm with Seinfeld!


(Video Transcript)


Hi, I’m Julie with San Diego Voice and Accent, and I hope you’re ready to laugh! In this video, you’ll learn all about American English pronunciation as I complete a sentence analysis task of a scene from the American TV show Seinfeld.


In this sentence analysis video, I’ll play you a scene from a TV show, and then I’ll analyze each sentence and describe exactly what that person said and how they said it. I suggest you grab a pen and paper so you can take notes because I’m going to discuss all areas of pronunciation - vowels, consonants, reductions, stress,  linking, intonation, vocabulary - so there’s going to be a lot of information to digest.


First you’ll watch the full conversation, then I’ll complete the analysis.

Analysis of American English pronunciation, stress, reductions, intonation, and more


I’m sorry, we have no mid-size available at the moment.

First, let’s talk about sentence-level stress. Which words in this sentence have the most stress? I hear sorry and have as the words with the stress, and I think sorry has the most stress. She also stresses syllables within certain words, like vail in available, but I don’t think the word available stands out as a stressed word.

She begins this sentence with the words “I’m sorry,” which technically means she is apologizing for something. But her intonation pattern doesn’t match her words. She says I’m sorry, but her intonation pattern says she isn’t being sincere - and she doesn’t really care about this situation. She also uses flat facial expressions that convey the message: I don’t care.

Let’s look at some of her linking patterns. The word mid-size is a two syllable word, and the first syllable ends in a D consonant. The next syllable begins with an S consonant. This is a type of consonant to consonant linking, and in this word she uses an unreleased D consonant to link the two syllables together. She doesn’t say mid size; she doesn’t release her tongue from the alveolar ridge as she says mid. Her tongue tip stays up at the alveolar ridge, and then she transitions directly to the S consonant.

Then she uses consonant to vowel linking to link size to available. Si-zuh-vailable. It’s almost like the final Z in size moves over to the beginning of the next word, available. 

Then she uses another form of consonant to vowel linking as she links the dark L in available to the word at. And to link these words together, she uses a Light L. She pronounces the dark portion of the L in available, then her tongue tip comes up to the alveolar ridge to make a light L as she links it to the word at. Available at. Available at.

Then she links the word at to the word the by using a stop T. This is common when a word or syllable ends in a T sound and the next word or syllable begins with a consonant. At the. At the.

She reduces the word at to a quick schwa vowel and a glottal T. And I’m guessing that she also used the reduced pronunciation of the voiced TH at the beginning of the word the. The reduced pronunciation of the voiced TH is made by keeping the tongue tip inside the mouth and pushing the tongue tip against the back of the upper and lower teeth. This type of pronunciation is common when the voiced TH is at the beginning of unstressed words or syllables, like the word the in this sentence.

She uses a true T sound at the end of moment, which means she releases her tongue and a small puff of air comes out as she says the T sound. Moment. Moment. Moment. When the last word in a sentence ends in a stop consonant, like the T, D, P, B, K, or G consonants, it is up to the speaker to decide if they want to use a true, released consonant, or an unreleased consonant. In this sentence, she chooses to use a true, released T consonant, moment. 

I don’t understand - I made a reservation. Do you have my reservation? 

Let’s start with sentence-level stress. Which words are stressed the most? I hear understand and reservation in the first sentence, and have and reservation in the second sentence. And I think the word reservation has the most stress in both sentences.


He says this first part really fast, I don’t understand, and he uses a lot of reductions. He reduces the diphthong AI to just the first part of the diphthong, which is common to do because the word I is a pronoun, and pronouns are often unstressed and reduced in spoken English. Then he deletes the T in don’t, which helps him to say it faster. He uses what’s called a nasal flap here, and it’s also called the vanishing T. When NT occur next to each other between vowels, and the second vowel is unstressed, the T consonant can be deleted, and you only pronounce the N consonant. This process is not obligatory - native speakers don’t have to do it - but it is common in certain words like internet and interview. There are Ts in both of those words, but they aren’t always pronounced. That’s what happens in this phrase - don’t understand. The T is deleted, and he says don-understand.


Then he also deletes the first D in understand, and possibly the second D, but I’m not quite sure. He says something like “unnerstan”, which is not the standard pronunciation, but in fast, informal conversation, native speakers may use this pronunciation.


Next, he reduces the word a to uh, the schwa vowel. This is a very common reduction in spoken English. The word a is an article, and articles are function words. Function words are typically reduced in spoken English. 

He also reduces the OO vowel in the word you. The vowel is shorter and doesn’t have the full vowel gliding or lip rounding that you hear when the OO vowel is stressed. 


Now let’s look at the linking. Since he deleted the T in don’t, he uses the final N consonant to link to the word understand. Don-understand. Then he uses consonant to vowel linking to link made to uh, may-duh. It’s almost as if the final D in made moves over to the beginning of the next word, uh. 


Finally, we’ll look at the intonation of this clip. His first sentence is a statement, so his intonation goes down at the end. I don’t understand - I made a reservation. Then his intonation goes up - do you have my reservation? - because this is a yes/no question. The answer is either yes or no.


Yes, we do. Unfortunately, we ran out of cars.

Which words do you hear as having the most stress? I hear yes and do in the first sentence, and ran and cars in the second sentence. Yes has the most stress in the first sentence, and cars has the most stress in the second sentence. 


The first sentence is “Yes, we do”. She could have just said yes, and that would have answered his question. “Do you have my reservation?” “Yes.” But she adds “we do” to her response. She does this for emphasis - Yes, we do - to make sure he understands the answer is definitely Yes. It’s like she’s saying yes twice. You can do this with other types of phrases if you want to emphasize your answer, like "Yes, I can," or "No, I can’t," or "Yes, it’s true," or "No, it’s not true." 


Unfortunately. This is a 5-syllable word, so it can be a little tricky to pronounce. The second syllable has the primary stress - un - FOR - chuh- nit- lee. Let’s look at the way the Ts are pronounced in this word. The first T makes the CH sound, like in the word chair. So it’s spelled with the letter T, but it makes the CH sound. CH. CH. CH. And the second T is the glottal stop T. Un-for-chuh nit lee. Nit lee, nit lee - that T sound is made with the vocal cords, not the tongue. It’s not, unfortunately, with a True T sound. It’s, unfortunately, with a stop T. When the T consonant is at the end of a word or syllable, and the next word or syllable begins with a consonant, the T sound is typically a glottal stop T or an unreleased T. In this example, she uses a glottal stop T, and that’s also how I pronounce this word.


She uses a phrasal verb in the phrase “ran out of cars.” This phrasal verb, to run out of something, means to use all of your resources or inventory - there’s nothing remaining. So she is saying that they have sold all of their inventory of mid-size cars. Then she links ran to out by extending the N sound, rannout. The T in out turns into the flap here, because when T is between vowels and the second vowel is unstressed, the T consonant typically becomes the flap, which is a light D sound. Ran out of, ran out of. 

But the reservation keeps the car here. That’s why you have the reservation.

Which words do you hear as having the most stress? In this first sentence, he stresses here the most - his voice builds up to it - HERE. There is also some secondary stress on reservation. In the second sentence, the word have has the most stress, and again, his voice builds up to it. 


Let’s look at the reductions in these sentences. It’s hard to be sure, but I think he uses the reduced pronunciation of the voiced TH in the because the word the is unstressed in this sentence, and it’s very common to reduce the pronunciation a little bit so that it’s easier to say. To make the reduced pronunciation of the voiced TH consonant, the tongue tip stays inside the mouth and makes contact with the back of the upper and lower front teeth. The, the, the. Be careful not to make a D sound, D, D, D. The tongue tip needs to be lower and touching the back of the front teeth, not at the alveolar ridge.


He most likely also uses the reduced pronunciation of the voiced TH in the, that’s, and the, again for the same reasons - these words are unstressed in these sentences, and it’s much easier to pronounce these words with the reduced tongue placement.


He also reduces the word you to yuh. That’s why yuh have the reservation. This is a type of reduction that I don’t use very often, maybe only in certain phrases like “whatcha up to?” or “whatcha doin?” I don’t typically say yuh for you; I think it sounds too informal.


He uses good linking throughout these sentences to link the words and syllables together, and there aren’t many actual linking rules that he follows - rather, he just smooths out the connections between the words as much as possible. But here are a few actual linking rules that he does use: He uses a glottal stop T in the word but, to link but and the, because the word the begins with a consonant. But the, but the. Then he links why and yuh together with continuous vocal cord vibration - almost like he uses a quick Y sound as the link. 

I know why we have reservations.

I don’t think you do!

Let’s look at the sentence-level stress. I hear I and know as the words that are stressed in the first sentence, and know has just a little bit more stress. It’s not common to stress two words in a row, especially when one is a function word, like I, and the other is a content word, like know. But she chooses to stress both of these words for emphasis - she’s trying to make a clear statement that she knows the reason for reservations.


Then in the second sentence, I hear I and do as the words with the stress, and do has the most stress. The pitch of his voice builds up to the final word, DO, for emphasis. 


Let’s look at the intonation patterns that they use in these sentences. She uses a downward intonation pattern on individual words in this sentence as well as an overall downward intonation pattern at the end. This is to show confidence in what she is saying.


He uses an intonation pattern that builds up to the word do, and then when he gets to do, the pitch of his voice goes up and down really quickly. Do, do, do. This is done for emphasis, and the downward intonation at the very end of do also conveys that he is confident about what he said.


She links know and why together so that they sound like one word, “knowwhy”. It’s not, know why. She simply continues with the OW diphthong in “know” as she says the W in the next word, why - these sounds have similar mouth placements. But she changes her intonation between the two words and she pauses slightly, know why. The word know has higher intonation and is said for a longer duration - with more stress - and you can clearly hear she is saying two different words.


He reduces the word you to yuh, with the schwa vowel. Yuh, yuh, yuh. The initial TH in think gets the full pronunciation here because this is the voiceless TH sound, and that always receives the full tongue placement. The very tip of the tongue goes between the top and bottom front teeth, and the air exits the mouth where the top teeth touch the tongue surface. It’s a relatively narrow airflow. 


He uses a very quick stop T in don’t as he links it to the next word, think, because the next word begins with a consonant. Don’t think. Don't think. He also links “think” and you together, and it’s almost like the K in think begins the next word you, thing-kyou, thing-kyou. 


If you did, I’d have a car. So you know how to take the reservation. You just don’t know how to hold the reservation. 

He does a lot of fun things here with the stress. Which words do you hear as having the most stress? I hear did and car in the first sentence. Then know and take in the second sentence. Then just, know, and hold in the last sentence. I think the words with the most stress are did, car, take, and hold. He stretches out the duration of the word hold for emphasis, and he pauses slightly to add even more emphasis.


Let’s look at his intonation of the word did. He uses an up, down, up intonation pattern - did, did, did, did.  This is done for emphasis, and it’s something that his character does often throughout this show. He plays around with his intonation a lot, and that adds to the comedy of this show.  


He reduces a to uh, and says have-uh, have-uh. And then he uses a very light released D in I’d to link this word to have. I’d have a, I’d have a.


Then in the next sentence, he reduces the phrase how to to “how duh”. The word to turns into the flap and a schwa. Duh, duh, how duh. The T can become the flap when T comes between vowels and it’s in an unstressed syllable. Here you have the OW vowel in how and the schwa uh in “tuh,” so it becomes how duh take. 


The K in take isn’t fully released. Take the, take the. The back of the tongue comes to the back of the mouth for the K sound, and then the tongue stays there, and that’s the end of the K sound. 


Here he reduces the OO in you to a very quick OO vowel. You, you, you. Then he uses the same reduction of how to as he did before - he reduces this phrase to how duh.


Then to link this sentence together, he deletes the T in just, which is very common to do in spoken English when the following word begins with a consonant. Jussdon’t. Jussdon’t. He uses a very light stop T in don’t - it happens quickly. Don’t know, don’t know. I also typically use a stop T in this phrase “don’t know”. A stop T is made with the vocal cords, not the tongue. It’s also called a Glottal Stop or a Glottal Stop T - those terms mean the same thing.  

And that’s really the most important part of the reservation - the holding. 

Here’s another example of fun stress and intonation patterns. Which words do you hear as having the most stress? I hear that’s, really, most, important, and part in the first phrase, and holding in the second phrase. And the words really and holding have the most stress.


Let’s look at the reductions that he uses in this clip. He drops the D in and, and just says “an.” An that’s. This is a very common reduction in spoken English, and it occurs when the word and is followed by a word that begins with a consonant. The D sound is often dropped in this situation because of the Rule of Three. When three consonants are in a row, either within a word or across word boundaries, the middle consonant is often dropped. 
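If it helps to see the Rule of Three written out as a step-by-step check, here is a minimal Python sketch. It is only an illustration under some assumptions that are not part of the video: the function name rule_of_three is made up for this example, the sounds are written with ARPAbet-style labels (DH for the voiced TH, JH for the J sound), and it only handles the case where a word ends in two consonants and the next word begins with a consonant. Real speakers apply this rule optionally, so treat it as a rough model rather than a description of what every speaker does.

```python
# ARPAbet-style vowel labels (an assumption for this sketch, not from the video).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def is_consonant(phoneme):
    """Anything that is not in the vowel set counts as a consonant."""
    return phoneme not in VOWELS


def rule_of_three(first_word, next_word):
    """Return the sounds of first_word after applying the Rule of Three.

    If first_word ends in two consonants and next_word begins with a
    consonant, the middle of the three consonants (the last sound of
    first_word) is dropped. Otherwise the word is returned unchanged.
    """
    ends_in_cluster = (len(first_word) >= 2
                       and all(is_consonant(p) for p in first_word[-2:]))
    next_starts_with_consonant = bool(next_word) and is_consonant(next_word[0])
    if ends_in_cluster and next_starts_with_consonant:
        return list(first_word[:-1])
    return list(first_word)


# "and that's": N + D + DH are three consonants in a row, so the D is dropped.
print(rule_of_three(["AE", "N", "D"], ["DH", "AE", "T", "S"]))   # ['AE', 'N']

# "just don't": S + T + D, so the T is dropped.
print(rule_of_three(["JH", "AH", "S", "T"], ["D", "OW", "N", "T"]))  # ['JH', 'AH', 'S']

# "made a": the D is followed by a vowel, so nothing is dropped.
print(rule_of_three(["M", "EY", "D"], ["AH"]))                   # ['M', 'EY', 'D']
```

The sketch works on sounds rather than spelling on purpose: the letters TH are two letters but one consonant sound, so counting letters would give the wrong answer.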


I think he uses the full tongue position for the voiced TH in that’s, because he stresses the word that’s in this sentence, so the tongue tip should come out for this TH. But later on in the sentence, I think he does use the reduced pronunciation of the voiced TH sound both times he says the word the. These words aren’t stressed in this sentence, so they will most likely be reduced. He also reduces the schwa in the to a very quick schwa sound, the, the, the.


Now looking at the linking, he does a nice job of smoothing out the connections between words, even when there isn’t an actual linking rule that can be applied. The word most links up with the word important, and here you have two consonants linking up with a vowel. You sometimes hear native speakers apply the Rule of Three to this type of phrase, and they say “mohs-important” - they may delete the final T in most. That can happen in fast speech, even though it’s technically not the Rule of Three, since there aren’t three consonants in a row. In this conversation, I think the character pronounces that final T in most, but it happens very quickly. Most important.


Then to link the word important to the word part, he uses a stop T. This is common when the T sound links up to a consonant - it often turns into a stop T or an unreleased T. But let’s look at how he links part and of. Here you have a T sound that is between two vowels, and the second vowel is unstressed, which means the T sound typically turns into the flap. Part of, part of, part of.
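The T patterns described in this analysis can be summarized as a small decision procedure. Here is a hedged Python sketch of it. The function name t_variant and its parameters are invented for this illustration, and the final fallback to a true T is my simplifying assumption for cases the analysis doesn't cover. Speakers have more freedom than this, so read it as a summary of the tendencies described here, not as a complete rule set.

```python
def t_variant(next_sound, prev_is_vowel=False, next_is_stressed=False):
    """Suggest how a T is likely to be pronounced in connected speech.

    next_sound: "vowel", "consonant", or None when the T ends the sentence.
    prev_is_vowel: True if the sound before the T is a vowel.
    next_is_stressed: True if the following vowel is stressed.
    """
    if next_sound is None:
        # End of the sentence: the speaker chooses, as with "moment".
        return "true or unreleased"
    if next_sound == "consonant":
        # T before a consonant: stop T (glottal or unreleased), as in "at the".
        return "stop"
    if next_sound == "vowel" and prev_is_vowel and not next_is_stressed:
        # T between vowels, second vowel unstressed: the flap, as in "part of".
        return "flap"
    # Assumption for this sketch: everything else keeps a true T.
    return "true"


# "important part": the final T in important is followed by P.
print(t_variant("consonant"))                  # stop

# "part of": treated in this analysis as a T between two vowels.
print(t_variant("vowel", prev_is_vowel=True))  # flap

# "...at the moment.": the T is the last sound in the sentence.
print(t_variant(None))                         # true or unreleased
```

The order of the checks matters: the end-of-sentence case comes first because there is no following sound to test.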


Important. He pronounces it with two stop Ts, important, important. He doesn’t use the true T here - it’s not, important, important. He uses his vocal cords for the T sounds. Important. Import-nt.


And lastly, let’s look at his overall intonation pattern. He uses an upward intonation on the word reservation to signal that he’s not finished talking - he wants to say more about this topic - then he goes down on holding to indicate he is finished talking.


Anybody can just take them.

Let me, uh, speak with my supervisor.

Which words do you hear as having the most stress? I hear anybody and take in the first sentence, and let, speak, and supervisor in the second sentence. I think anybody has the most stress in the first sentence. In the second sentence, let has the most stress in the first phrase, and supervisor has the most stress in the second phrase.


Let’s talk about the reductions first. He reduces can to the schwa uh, kun, kun. This is a very common reduction. The word can is a function word, and when it’s unstressed in a sentence, it is typically reduced. He also reduces the pronoun them to um. He drops the initial TH and reduces the vowel to the schwa uh vowel, um, um, um, take um. Pronouns that begin with H and TH, like him, her, his, and them, can be reduced in spoken English. Sometimes the initial consonant is dropped and the vowel is reduced. 


Now for the linking, I hear him delete the final T in just as he links just with take. Jusstake. Jusstake. This is common when the word just links up with a word that begins with a consonant because of the Rule of Three - three consonants in a row, the middle consonant is often deleted.


She uses a stop T in let because the next word begins with a consonant. Let me, let me. Then she also links the word me with a thinking word, uh, and it sounds like “mee-uh”.  And she links speak and with together, and it’s almost like the K in speak begins the next word, with. Spea-kwith, spea-kwith.  


And finally, let’s look at her overall intonation pattern. Her overall intonation pattern goes down at the end, but she doesn’t sound very excited about speaking with her supervisor - she doesn’t sound sincere. 


Now for the full conversation with the analysis.


Wow - that was a lot of information for you to digest, but I hope it was both useful and entertaining. Good luck as you practice your English pronunciation, and if you want any help from me, join me for a live English class at English Pro™ Live. The link to join is in the description below. Have a great day!


And I'd love to hear from you - contact me to learn how we can work together to perfect your American English pronunciation!

Julie Cunningham | San Diego Voice and Accent
