Over the past few years, the MP3 format has become extremely fashionable and popular. At any stall that sells computer CDs you can easily find more than a dozen discs along the lines of "Complete Anthology of Group XXX", and below this modest inscription: MP3. More often than not, to complete the picture, the fashionable phrase "CD quality" is splashed across the cover, meaning quality like that of an Audio CD. That is what our story is about, and more: MP3s, what they are, and the quality of sound in MP3.

About MP3 format

Let's start with a little understanding of the subject area. What is this MP3 anyway?

MP3, more correctly called MPEG-1 Layer 3, is a lossy audio compression standard. The main goal in creating the standard was to keep the sound as close to the source as possible while minimizing the amount of stored data. For this, an original coding scheme was created: at the first stage, the digitized sound is split into frequency components, which pass through a series of filters.

The main difference between MP3 and previous standards is in the filtering. The developers of the standard created a so-called psychoacoustic model, a model that takes into account certain features of human hearing. On the basis of this model, the frequencies whose absence the ear hardly notices are filtered out of the audio signal. At the second stage, the resulting stream is encoded using the Huffman algorithm with a static table. The result is the MP3 stream.

In addition, ID3 tags (tags containing the name of a song, artist, other information) and various service information can also be added to the MP3 file.

Compression modes and bitrates

Bit rate (stream width) determines how many bits are needed to encode 1 second of music. The MP3 standard regulates streams from 8 kbit/s to 320 kbit/s. The most typical bitrate is 128 kbit/s.

Based on the stream, it is easy to calculate how much space one minute of music will take: divide the bitrate by 8 (the number of bits per byte) and multiply by 60 (seconds per minute) to get the number of kilobytes. For the already mentioned 128 kbit/s stream, this is 128 / 8 × 60 = 960 kilobytes, or about a megabyte for every minute of recording.
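The arithmetic above can be sketched in a few lines of Python. The function name is my own, and the sketch assumes 1 kilobyte = 1000 bytes and ignores tags and other overhead, so treat the result as an estimate rather than an exact file size.

```python
def mp3_size_kilobytes(bitrate_kbps: float, seconds: float) -> float:
    """Approximate size of the audio stream in kilobytes (1 KB = 1000 bytes).

    Divide the bitrate by 8 (bits per byte) and multiply by the duration.
    Tags and other service data are not counted.
    """
    return bitrate_kbps / 8 * seconds


# One minute at 128 kbit/s, as in the text.
print(mp3_size_kilobytes(128, 60))        # 960.0

# A three-hour recording at 192 kbit/s (an illustrative case).
print(mp3_size_kilobytes(192, 3 * 3600))  # 259200.0
```

So a three-hour recording at 192 kbit/s comes to roughly 259 megabytes under these assumptions.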

It is quite natural that the higher the bit rate, the more details of the sound can be preserved, the more realistic it sounds. In the choice of bitrate when encoding, you have to sacrifice something - either quality in favor of small size, or size in favor of quality.

The simplest MP3 compression mode is Constant BitRate (CBR). Previously, MP3 compilations used the already mentioned 128 kbit/s bitrate almost 100% of the time, and the "CD quality" inscription was present on the discs. Frankly, this is just a blatant lie. In practice, only on the cheapest speakers is such an MP3 indistinguishable from an audio CD.

The quality level at a bitrate of 128 kbit/s is roughly that of an average tape recorder playing a tape that is no longer fresh, maybe a little better. It is also worth adding that this is the bitrate most widespread in recordings available on the Internet.

To simplify the discussion of higher bitrates, here is their grid: 128 kbit/s, 160 kbit/s, 192 kbit/s, 224 kbit/s, 256 kbit/s, 320 kbit/s. The bitrates of 160 and 192 kbit/s are already noticeably better in quality than 128 kbit/s, and the resulting files are still not that large. Codec "artifacts" (flaws) are almost inaudible (at least on my system).

I have never encountered a bitrate of 224 in its pure form, so I can't say anything about its quality, but it should be higher than on the previous step of the bitrate ladder. I have not come across reviews that covered this bitrate either. Apparently this is because the next bitrate, 256 kbit/s, is recognized as transmitting sound accurately, with an almost complete absence of distortion. In the instructions for the Lame codec, this bitrate is even named Studio Quality. And the ceiling, 320 kbit/s, is intended for those who value quality above all, or for owners of very high-quality Hi-Fi or even Hi-End equipment.

Now let's move on to a slightly more complex issue: the variable bitrate mode (VBR, Variable BitRate). Here the concept of bitrate is rather vague. Consumer-oriented codecs generally expose only a quality control (as in Xing Audio Catalyst). Others (Lame) allow you to set additional parameters: minimum and maximum bitrates, and again quality.

When encoding in VBR, the codec itself chooses the required bitrate based on the parameters given to it, and the bitrate can change over the course of the encoded fragment. To estimate the required bitrate, the already mentioned psychoacoustic model is used. However, the model is not perfect (nothing in our world is) and sometimes gives incorrect results. This leads to an underestimated bitrate and, accordingly, a drop in the actually audible sound quality.

For this case the Lame developers advise setting a minimum bitrate threshold in order to avoid very bad results. VBR also includes ABR (Average BitRate) encoding. Recently, reviews have had only positive things to say about this mode, especially ABR at 256 kbit/s. This mode works in much the same way as VBR, except that the codec keeps to the specified average value. At the moment I know of only one codec that has an ABR mode, and that is Lame.

Codec selection

Until recently, a user who wanted decent-quality MP3 did not have much choice: either an ISO-based codec (based on the sample MP3 codec source code issued by the International Organization for Standardization), or a codec from Fraunhofer IIS (the institute that develops MP3). Plus the codecs in Xing products.

After reading various reviews and doing a little research of my own, I came to the conclusion that the Xing product line is best avoided. Even in relatively new versions, all their products that can create MP3s with built-in tools do it as poorly as possible.

There are also a lot of amateurish knock-offs built on the codec stolen from Xing (almost all of them contain the tompg.exe file). For a long time their main advantage was speed (at the expense of quality), but today the Lame codec shows comparable speed at higher quality. In addition, Xing products generally cost money, while Lame is free by definition.

Next, the Fraunhofer IIS products. All of their free MP3 compression programs are heavily pared-down versions of their own commercial products. Furthermore, their codecs have not evolved for a long time and contain no new tools and no VBR/ABR support, and they are not particularly fast either. Their only justified use is compression at bitrates below 128 kbit/s: they have been specially optimized for low bitrates (in some places, however, in violation of the standard).

The various codecs based on the ISO code all suffer in principle from the same drawback: low-quality compression at bitrates below 192 kbit/s. In addition, most of them (including BladeEnc) are pretty slow.

In my opinion, the Lame codec is the best option. It started out as a free ISO-based codec, but has grown during development and is now used as a reference for MP3 when comparing MP3 to other formats. A little over a year ago, the Lame project finally got rid of the ISO code and can now be considered a completely independent codec.

Development of the codec is quite intensive: it is constantly updated and bugs are fixed. In addition, Lame can be used not only under Windows but also under various Unix systems, and it also works in pure DOS. Again, it is completely free, the source code is available (for those who like to dig into it), and several sites offer precompiled binaries (.exe and .dll) optimized for different processors.

There is also a slightly stripped-down version of Lame - the GOGO-no-coda encoder, which shows fantastic results in speed (twice as fast as the already fast Lame).

So which bitrate and which mode should you use?

Considering all of the above, I would recommend archiving MP3s at either 320 kbit/s in CBR mode or 256 kbit/s in ABR mode. The first, in my opinion, is somewhat preferable, since you get the highest quality available within the format. For "listen and erase" recordings it is reasonable to use ABR at 192 kbit/s.

And one more thing: it is better not to use bitrates below 192 kbit/s for any long-term storage, unless the recording from which the MP3 was made is always at your fingertips (although remember that analog tape recordings degrade over time).

Very often the argument I hear in favor of low bitrates and "crooked" compression is "I have bad speakers, and I can't hear the difference anyway". My answer: things can change, and one day you may have to play your archive on decent equipment without being able to get back to the original recording. This answer is not at all far-fetched; I can cite a case from my own practice.

In our city, Pavlovo, there was once a small club where music was played from a computer (MP3 with a bitrate no higher than 160 kbit/s). Then the club quietly closed down, and the computer with the music archives moved to another company that organized public events. Imagine: they decided to play this music at the City Day celebration! It was a horror when, on more or less decent speakers, all the defects introduced by compression at such a low bitrate became audible. The sound was worse than from their well-worn cassette recorder with half-chewed cassettes. It would be wise to avoid repeating other people's mistakes, right?

Test hardware and software

Computer: Athlon TB 650MHz, M / B Acorp 7KTA 100MHz FSB, 128Mb RAM PC-133, HDD Quantum 40Gb 5400rpm, SoundBlaster 16 Vibra, AC97 codec.
Audio system: amplifier Radiotehnika U-7111, pair of speakers Radiotehnika S-90B.
Software: OS Windows98 SE, Winamp 2.75, Eac 0.9pb11, Lame 3.90a, GOGO-no-coda 3.07a

Debunking popular myths about digital audio.

2017-10-01T15: 27

Audiophile's Software

Note: For a better understanding of the text below, I highly recommend that you familiarize yourself with the basics of digital audio.

Also, many of the points discussed below are covered in my publication "Once again about the sad truth: where does good sound really come from?".

The higher the bitrate, the better the track

This is not always the case. First, let me remind you what bitrate is. In essence, it is the data rate in kilobits per second during playback. That is, if we take the track size in kilobits and divide it by its duration in seconds, we get its bitrate: the so-called file-based bitrate (FBR). It usually does not differ much from the bitrate of the audio stream (the difference comes from metadata in the track: tags, embedded images, etc.).
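As a minimal sketch of the file-based bitrate described above, assuming 1 kilobit = 1000 bits (the function name and the example file size are hypothetical):

```python
def file_based_bitrate_kbps(file_size_bytes: int, duration_seconds: float) -> float:
    """File-based bitrate (FBR): total file size in kilobits divided by duration.

    Includes everything in the file (tags, embedded images), so it can be
    slightly higher than the bitrate of the audio stream itself.
    """
    kilobits = file_size_bytes * 8 / 1000
    return kilobits / duration_seconds


# A hypothetical 4,800,000-byte track that lasts 4 minutes.
print(file_based_bitrate_kbps(4_800_000, 240))  # 160.0
```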

Now let's take an example. The bitrate of uncompressed PCM audio recorded on a regular Audio CD is calculated as follows: 2 (channels) × 16 (bits per sample) × 44100 (samples per second) = 1411200 (bps) = 1411.2 kbps. Now let's compress the track with any lossless codec (that is, one that does not lead to the loss of any information), for example FLAC. As a result, we get a bitrate lower than the original one, but the quality remains unchanged. Here is your first refutation.
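The same multiplication, written out as a small Python sketch (the function name is illustrative):

```python
def pcm_bitrate_bps(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return channels * bits_per_sample * sample_rate_hz


cd_bps = pcm_bitrate_bps(channels=2, bits_per_sample=16, sample_rate_hz=44100)
print(cd_bps)         # 1411200
print(cd_bps / 1000)  # 1411.2
```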

Something else is worth adding here. The output bitrate with lossless compression can vary widely (but, as a rule, it is lower than that of uncompressed audio): it depends on the complexity of the compressed signal, or rather on the data redundancy. Thus, simpler signals compress better (i.e. we get a smaller file for the same duration, and therefore a lower bitrate), and more complex signals compress worse. That is why lossless classical music has a lower bitrate than, say, rock. But it should be emphasized that the bitrate here is by no means an indicator of the quality of the sound material.

Now let's talk about lossy compression. First of all, you need to understand that there are many different encoders and formats, and even within the same format the encoding quality of different encoders may differ (for example, QuickTime AAC encodes much better than the outdated FAAC), not to mention the superiority of modern formats (OGG Vorbis, AAC, Opus) over MP3. Simply put, of two identical tracks encoded by different encoders at the same bitrate, one will sound better and the other worse.

In addition, there is such a thing as an upconvert. That is, you can take an MP3 track with a bitrate of 96 kbps and convert it to MP3 at 320 kbps. Not only will the quality not improve (after all, the data lost during the earlier 96 kbps encoding cannot be recovered), it will even get worse. It is worth pointing out that every stage of lossy encoding (at any bitrate and with any encoder) introduces a certain amount of distortion into the audio.

And there is one more nuance. If, say, the bitrate of an audio stream is 320 kbps, this does not mean that all 320 kilobits were actually spent on encoding each second. This is typical for constant bitrate encoding, and for cases where someone hoping for maximum quality forces a constant bitrate that is too high (for example, setting 512 kbps CBR for Nero AAC). As you know, the number of bits allocated to a particular frame is governed by the psychoacoustic model. But when the allocated amount is much lower than the set bitrate, even the bit reservoir does not help (for the terms, see the article "What is CBR, ABR, VBR?"). As a result, we get useless "zero bits" that simply pad the frame to the required size (i.e. inflate the stream to the specified bitrate). By the way, this is easy to check: compress the resulting file with an archiver (preferably 7z) and look at the compression ratio. The higher it is, the more zero bits there are (because they create redundancy), and the more space is wasted.
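The archiver check can be imitated in Python. This is only a sketch under my own assumptions: it uses the standard-library lzma module (the same family of algorithm that 7z uses by default) on synthetic byte strings rather than on a real audio file, with random bytes standing in for densely packed audio data and zero bytes standing in for padding.

```python
import lzma
import os


def compression_ratio(data: bytes) -> float:
    """Original size divided by LZMA-compressed size; higher means more redundancy."""
    return len(data) / len(lzma.compress(data))


# Stand-in for a stream where every bit carries information.
dense = os.urandom(8000)

# Stand-in for a stream where most of each frame is zero padding.
padded = os.urandom(2000) + bytes(6000)

print(compression_ratio(dense))   # close to 1: almost nothing to squeeze out
print(compression_ratio(padded))  # noticeably higher: the padding compresses away
```

To try it on a real track, read the file with `open(path, "rb").read()` and pass the bytes to `compression_ratio`.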

Lossy codecs (MP3 and others) are able to cope with modern electronic music, but they are not able to efficiently encode classical (academic), live, instrumental music

The "irony of fate" here is that in reality everything is exactly the opposite. As you know, academic music in the overwhelming majority of cases follows melodic and harmonic principles and is performed by an instrumental ensemble. From a mathematical point of view, this leads to a relatively simple harmonic composition of the music. Thus the predominance of consonances produces fewer side harmonics: for a fifth (the interval in which the fundamental frequencies of two sounds differ by a factor of one and a half), every second harmonic is shared by the two sounds; for a fourth, where the frequencies differ by one third, every third; and so on. In addition, the fixed frequency ratios that come with equal temperament also simplify the spectral composition of classical music. The live instrumental line-up of classical music means the absence of the noises characteristic of electronic music, of distortion, of sharp jumps in amplitude, and of excess high-frequency components.

The factors listed above make classical music much easier to compress, first of all purely mathematically. If you remember, mathematical compression works by eliminating redundancy (describing similar pieces of information using fewer bits) and also by prediction (so-called predictors predict the behavior of the signal, and then only the deviation of the real signal from the predicted one is encoded; the more closely they match, the fewer bits are needed). Here, the relatively simple spectral composition and harmonicity create high redundancy, the elimination of which gives a significant degree of compression, while the small number of bursts and noise components (which are random and unpredictable signals) makes the vast majority of the information mathematically predictable. And this is not to mention the relatively low average volume of classical tracks and the frequent intervals of silence, which require practically no information to encode. As a result, we can losslessly compress, for example, some solo instrumental music to bitrates below 320 kbps (the TAK and OFR encoders are quite capable of this).

So, firstly, the mathematical compression underlying lossless encoding is also one of the stages of lossy encoding (see the article "Understandably about MP3 encoding"). And secondly, since lossy encoding uses the Fourier transform (decomposition of the signal into harmonics), the simplicity of the spectral composition makes the encoder's work doubly easy. As a result, comparing the original and the encoded sample of classical music in a blind test, we are surprised to find that we cannot hear any difference, even at a relatively low bitrate. And the funny thing is that when we start lowering the encoding bitrate all the way down, the first thing that reveals the difference is the background noise in the recording.

As for electronic music, encoders have a very hard time with it. Noise components have minimal redundancy and, together with sharp jumps (sawtooth pulses of some kind), are extremely unpredictable signals for encoders that are tuned for natural sounds, which behave quite differently. The direct and inverse Fourier transform, with individual harmonics rejected by the psychoacoustic model, inevitably produces pre- and post-echo effects, whose audibility is not always easy for the encoder to evaluate. Add to this a high level of HF components, and you get a large number of killer samples that even the most advanced encoders cannot cope with at medium and low bitrates. Oddly enough, these are found precisely in electronic music.

Also amusing are the opinions of "experienced listeners" and musicians who, with a complete misunderstanding of the principles of lossy coding, claim to hear instruments going out of tune, frequencies drifting, and so on after encoding. This might perhaps be true for antediluvian cassette players with wow and flutter, but in digital audio everything is exact: a frequency component either remains or is discarded, and there is simply nothing that would shift the pitch. Moreover, having an ear for music does not at all mean having good frequency hearing (for example, the ability to perceive frequencies above 16 kHz, which diminishes with age), nor does it make it any easier to find lossy coding artifacts. These distortions have a very specific character and require experience in blind comparison of precisely lossy audio: you need to know what to look for and where.
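To make the idea of prediction concrete, here is a minimal sketch. It assumes the simplest possible predictor, "the next sample equals the previous one", which is my own illustrative choice and not the predictor of any particular codec. The smaller the residual it leaves, the fewer bits a lossless encoder would need to store it.

```python
import math
import random


def residual_energy(samples):
    """Mean squared prediction error when each sample is predicted
    to be equal to the previous one (a first-order predictor)."""
    errors = [cur - prev for prev, cur in zip(samples, samples[1:])]
    return sum(e * e for e in errors) / len(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable

# A pure 440 Hz tone sampled at 44100 Hz: smooth and highly predictable.
tone = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]

# White noise in the same amplitude range: essentially unpredictable.
noise = [rng.uniform(-1.0, 1.0) for _ in range(4410)]

print(residual_energy(tone))   # small: only tiny deviations left to encode
print(residual_energy(noise))  # large: the predictor barely helps
```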

DVD-Audio sounds better than Audio CD (24 bit versus 16, 96 kHz versus 44.1, etc.)

Unfortunately, people usually look only at numbers and very rarely think about the impact of a particular parameter on the objective quality.

Let's first consider the bit depth. This parameter determines nothing more than the dynamic range, i.e. the difference between the quietest and loudest sounds (in dB). In digital audio the maximum level is 0 dBFS (FS - full scale), and the minimum is limited by the noise level, so the dynamic range is in fact equal in magnitude to the noise level. For 16-bit audio the dynamic range is calculated as 20 × log10(2^16), which equals 96.33 dB. The dynamic range of a symphony orchestra is up to 75 dB (mostly about 40-50 dB).

Now let's imagine real conditions. The noise level in the room is about 40 dB (do not forget that dB is a relative value; in this case the hearing threshold is taken as 0 dB), and the maximum music volume reaches 110 dB (so that there is no discomfort), which gives a difference of 70 dB. Thus, a dynamic range of more than 70 dB is simply useless in this case. With a wider range, either the loud sounds will reach the pain threshold or the quiet sounds will be drowned out by the surrounding noise. It is very difficult to get ambient noise below 15 dB (since human breathing and other noises caused by human physiology are at this level), so a range of 95 dB is quite sufficient for listening to music.

Now about the sampling rate (sample rate). This parameter sets how often the signal is sampled in time and directly determines the maximum signal frequency that the audio representation can describe. By Kotelnikov's theorem (known in English-language literature as the Nyquist-Shannon sampling theorem), that maximum is equal to half the sampling rate. That is, for the usual sampling frequency of 44100 Hz, the maximum frequency of the signal components is 22050 Hz. The maximum frequency perceived by the human ear is just above 20,000 Hz (and even then only at birth; as we grow older, the threshold drops to 16,000 Hz).
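The dynamic range formula, 20 × log10(2^N) for N-bit audio, is easy to evaluate for other bit depths as well. A small sketch (the function name is mine):

```python
import math


def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of N-bit PCM audio: 20 * log10(2 ** N)."""
    return 20 * math.log10(2 ** bit_depth)


print(round(dynamic_range_db(16), 2))  # 96.33
print(round(dynamic_range_db(24), 2))  # 144.49
```

Each extra bit adds about 6.02 dB, which is why 24-bit audio comes out at roughly 144 dB.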
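A short sketch of the half-the-sampling-rate rule, comparing the result with the 20,000 Hz hearing limit quoted in the text (the constant and function names are illustrative):

```python
HEARING_LIMIT_HZ = 20_000  # upper limit of human hearing quoted in the text


def max_signal_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sampling rate can describe: half the rate."""
    return sample_rate_hz / 2


for rate in (44_100, 96_000):
    limit = max_signal_frequency_hz(rate)
    print(rate, limit, limit >= HEARING_LIMIT_HZ)
# 44100 22050.0 True
# 96000 48000.0 True
```

Both rates already cover the audible range; the higher one only extends the representable band further beyond it.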

This topic is best covered in the article 24/192 Downloads - Why They Don't Make Sense.

Different software players sound differently (e.g. foobar2000 is better than Winamp, etc.)

To understand why this is not the case, you need to understand what a software player is. In essence, it is a decoder, handlers (DSP processors, optional), an output plugin (for one of the interfaces: ASIO, DirectSound, WASAPI, etc.), and of course the GUI (graphical user interface). Since in 99.9% of cases the decoder works according to a standard algorithm, and the output plugin is just the part of the program that passes the stream to the sound card through one of the interfaces, the only possible reason for differences is the handlers. But handlers are usually disabled by default (or should be disabled, since the main thing for a good player is to be able to deliver the sound in its "original" form). As a result, the only thing left to compare is the processing and output capabilities, which, by the way, are very often unnecessary. And even if there is such a need, it is then a comparison of handlers, not of players.

Different driver versions sound different

This statement is based on plain ignorance of how a sound card works. A driver is software that is necessary for the device to interact effectively with the operating system; it usually also provides a graphical user interface for managing the device, its parameters, and so on. The sound card driver ensures that the card is recognized as a Windows sound device, informs the OS about the formats supported by the card, transfers the uncompressed PCM stream (in most cases) to the card, and gives access to the settings. In addition, in the case of software processing (by means of the CPU), the driver can contain various DSPs (handlers). Therefore, firstly, if a driver with effects and processing disabled does not transfer PCM to the card accurately, this is considered a gross error, a critical bug, and it happens rarely. On the other hand, drivers can differ in updated processing algorithms (resamplers, effects), although this does not happen often either. In addition, effects and any driver processing should be switched off anyway to achieve the highest quality.

Thus, driver updates are mainly focused on improving stability and fixing handling errors. In our case, neither the one nor the other affects the playback quality, therefore in 999 cases out of 1000 the driver does not affect the sound.

Licensed Audio CDs sound better than their copies

If there were no (fatal) read/write errors during copying, and the optical drive of the device on which the copy will be played has no problems reading it, then this statement is erroneous and easily refuted.

Stereo encoding mode gives better quality than Joint Stereo

This misconception mainly concerns LAME MP3, as all modern encoders (AAC, Vorbis, Musepack) use only Joint Stereo mode (and that already says something).

For a start, it's worth mentioning that Joint Stereo mode is successfully used with lossless compression. Its essence lies in the fact that the signal before encoding is decomposed into the sum of the right and left channels (Mid) and their difference (Side), and then these signals are separately encoded. In the limit (for the same information in the right and left channels), double data savings are obtained. And since in most music the information in the right and left channels is quite similar, this method is very effective and allows you to significantly increase the compression ratio.
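The sum/difference decomposition can be sketched with integer samples as follows. This is a bare illustration of the Mid/Side idea, not the bit-exact scheme of any particular codec; the function names are my own.

```python
def to_mid_side(left, right):
    """Decompose stereo into the sum (Mid) and difference (Side) of the channels."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Restore Left/Right exactly: mid + side == 2 * left, mid - side == 2 * right."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


# Identical channels (the limiting case from the text): Side is all zeros.
left = [100, -200, 300, -32768]
right = [100, -200, 300, -32768]
mid, side = to_mid_side(left, right)
print(side)  # [0, 0, 0, 0]

# Different channels still round-trip without loss.
l2, r2 = [5, -3, -3], [2, 4, -4]
print(from_mid_side(*to_mid_side(l2, r2)) == (l2, r2))  # True
```

When the Side signal is all zeros, it costs almost nothing to store, which is where the twofold saving in the limiting case comes from.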

In lossy encoding, the principle is the same. But here, in constant bitrate mode, the quality of fragments with similar information in the two channels will increase (in the limit, it will double), while in VBR mode the bitrate in such places will simply decrease (do not forget that the main task of VBR mode is to maintain the specified encoding quality consistently while using the lowest possible bitrate). Since lossy encoding gives priority (in the allocation of bits) to the sum of the channels, dynamic frame-by-frame switching between Joint Stereo (Mid/Side) and normal (Left/Right) stereo is used to avoid degrading the stereo panorama. By the way, the cause of this misconception was the imperfection of the switching algorithm in older versions of LAME, as well as the presence of the Forced Joint mode, in which there is no automatic switching. In the latest versions of LAME, Joint mode is enabled by default and changing it is not recommended.

The wider the spectrum, the better the recording (about spectrograms, auCDtect and frequency range)

Nowadays on forums, unfortunately, it is very common to measure the quality of a track with a "ruler on the spectrogram". Obviously, because of the simplicity of this method. But, as practice shows, in reality everything is much more complicated.

And the point is this. The spectrogram visually demonstrates the distribution of signal power over frequencies, but it cannot give a complete picture of how the recording sounds or of the distortions and compression artifacts in it. In fact, all that can be determined from the spectrogram is the frequency range (and, in part, the spectrum density in the HF region). That is, at best, by analyzing the spectrogram you can identify an upconvert. Comparing the spectrograms of tracks produced by various encoders with that of the original is sheer absurdity. Yes, you can identify differences in the spectrum, but it is almost impossible to determine whether (and to what extent) they will be perceived by the human ear. We must not forget that the task of lossy encoding is to produce a result that is indistinguishable from the original to the human ear (not to the eye).

The same applies to assessing the quality of encoding by analyzing the output tracks with the auCDtect program (Audiochecker, auCDtect Task Manager, Tau Analyzer, fooCDtect are just shells for a one-of-a-kind console program auCDtect). The auCDtect algorithm also actually analyzes the frequency range and only allows you to determine (with a certain degree of probability) whether MPEG compression was applied at any of the encoding stages. The algorithm is tailored for MP3, so it is easy to "cheat" using Vorbis, AAC and Musepack codecs, so even if the program writes "100% CDDA" it does not mean that the encoded audio is 100% identical to the original one.

And returning directly to the spectra. Also popular is the desire of some "enthusiasts" to turn off the lowpass filter in the LAME encoder at all costs. This shows a lack of understanding of the principles of coding and psychoacoustics. First, the encoder trims high frequencies with only one purpose: to save data and spend it on encoding the most audible frequency range. An extended frequency range can be fatal to the overall sound quality and lead to audible encoding artifacts. Moreover, turning off the cutoff at 20 kHz is completely unjustified, since a person simply does not hear frequencies above that.

There is a kind of "magic" EQ preset that can significantly improve the sound

This is not entirely true, because each individual configuration (headphones, speakers, sound card) has its own parameters (in particular, its own frequency response), and therefore each configuration needs its own, unique approach. In simple terms, such an EQ preset exists, but it is different for different configurations. Its essence lies in correcting the frequency response of the signal path, namely in "leveling out" unwanted dips and peaks.

Also, among people far from direct work with sound, it is very popular to set the graphic equalizer in a "tick" (V-shaped) curve. This raises the level of the low- and high-frequency components, but at the same time muffles vocals and instruments whose spectrum lies in the midrange.

Before converting music to another format, you should "decompress" it to WAV

I'd like to point out right away that WAV means PCM (pulse-code modulation) data in a WAVE container (a file with the *.wav extension). This data is nothing more than a sequence of bits (zeros and ones) in groups of 16, 24 or 32 (depending on the bit depth), each of which is the binary code of the amplitude of the corresponding sample (for example, for 16 bits these are decimal values from -32768 to +32767).

So, the fact is that any sound processor, be it a filter or an encoder, usually works only with these values, that is, only with uncompressed data. This means that to convert audio from, say, FLAC to APE, it is simply necessary to decode FLAC to PCM first and then encode PCM to APE. It is like repacking files from ZIP to RAR: you must first unpack the ZIP.
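A small sketch of what "the binary code of the amplitude" looks like for signed samples, using Python's standard struct module. Signed N-bit samples span -2^(N-1) to 2^(N-1) - 1, so for 16 bits that is -32768 to +32767. The "<h" format (little-endian signed 16-bit) matches how 16-bit PCM is usually stored in WAVE files; the function name is my own.

```python
import struct


def sample_range(bit_depth: int):
    """Smallest and largest value of a signed PCM sample of the given bit depth."""
    return -(2 ** (bit_depth - 1)), 2 ** (bit_depth - 1) - 1


print(sample_range(16))  # (-32768, 32767)
print(sample_range(24))  # (-8388608, 8388607)

# Pack the two extreme 16-bit samples into their 2-byte little-endian form.
lowest, highest = sample_range(16)
packed = struct.pack("<hh", lowest, highest)
print(packed.hex())                  # 0080ff7f
print(struct.unpack("<hh", packed))  # (-32768, 32767)
```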

However, if you are using a converter or just an advanced console encoder, the intermediate conversion to PCM happens on the fly, sometimes even without writing to a temporary WAV file. This is what misleads people: it seems that the formats are converted directly to one another, but in fact such a program must have an input format decoder that performs an intermediate conversion to PCM.

Thus, manual conversion to WAV will give you absolutely nothing but wasted time.

The triumphant march of the MPEG-1 Layer 3 audio recording format (colloquially referred to as MP3) is explained by the fact that it offered a simple and effective method of compressing a sound file, allowing you to store up to 12 hours of music of acceptable quality on a standard CD-ROM.

To put it simply, the MPEG-1 Layer 3 algorithm is based on the so-called "psychoacoustic" compression method, in which frequencies and loudness levels that are not perceptible to the ear are excluded from the spectrum of the sound. The spectrum "cleaned" in this way is divided into separate blocks (frames) of the same duration and compressed in accordance with the specified requirements. When played back, the signal is formed from a sequence of decoded frames.

The compression ratio depends on the parameters of the audio stream, which must be output after decoding the file.

The main parameter that determines the sound quality and compression ratio is the so-called bitrate: the bandwidth of the stream, measured in bits per second.

The higher the number, the better the sound quality and the lower the compression ratio. Since almost all MP3 files are recorded in stereo with a sampling rate of 44.1 kHz and a depth of 16 bits, the determining factors for clear sound are the source of the recording, the codec used and the selected bitrate.

The word codec is formed by combining the words encoder + decoder. This is a software module that allows you to encode or decode audio or video files in accordance with its own algorithm.

A middle value of 256 Kbps for the stream provides a compression ratio of about 6:1; for other values the compression ratio changes proportionally. Thus, with a 256 Kbps stream you can record the music from six regular Audio CDs onto one CD, and with a 128 Kbps stream, from twelve regular music discs.
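The compression ratio can be estimated by dividing the Audio CD bitrate (2 × 16 × 44100 = 1411.2 kbps) by the MP3 bitrate. The sketch below is my own illustration; it gives about 5.5:1 and 11:1, which the figures above round to "about 6:1" and "twelve discs".

```python
CD_BITRATE_KBPS = 2 * 16 * 44100 / 1000  # 1411.2 kbps for uncompressed Audio CD


def compression_ratio_vs_cd(mp3_bitrate_kbps: float) -> float:
    """How many times smaller the MP3 stream is than the CD audio stream."""
    return CD_BITRATE_KBPS / mp3_bitrate_kbps


for bitrate in (128, 192, 256, 320):
    print(bitrate, round(compression_ratio_vs_cd(bitrate), 2))
# 128 11.03
# 192 7.35
# 256 5.51
# 320 4.41
```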

There is endless controversy among amateurs and professionals about the bit rate that provides good sound quality that matches the playback quality of an Audio CD.

Some consider 128 Kbps sufficient, while others are satisfied only by the maximum stream value, 320 Kbps. In all likelihood, both are right: the difference lies in what is recorded and under what conditions it is played back.

The bit rate at which the digitized audio was encoded is usually indicated on the CD cover. For example, the complete Beatles collection is available on three 128Kbps discs or six 256Kbps discs.

Clearly, the second option costs twice as much, but the quality is also better.

If music is played in a domestically produced car, a 192 Kbps stream will provide sufficient sound quality; you will not hear anything better anyway because of extraneous noise. For listening on a computer or a standalone player (MP3 player), 256 Kbps is acceptable.

But if the signal is passed unchanged to an external device and output to high-quality speakers, the maximum possible stream, 320 Kbps, is desirable. Based on the above considerations, a 256 Kbps stream can be considered universal: with good recording quality, it will provide adequate reproduction in most cases.

For broadcasting music over the Internet, a bitrate of 128 Kbps is usually used. At that rate, the sound quality leaves something to be desired.

It makes no sense to record popular music with a bitrate higher than 192-256 Kbps: songs do not live long, and the original recordings are often not of high quality. In the end, you can dance to the sound of "tape" quality.

Classics and rare original works are quite another matter. By classics we mean not only Bach or Mozart: today the Beatles, Led Zeppelin, Vysotsky, Tsoi and many other authors and performers can be considered classics.

If, when buying a CD, you did not pay attention to the bitrate value indicated on the package, then you can see the value in the player's line during file playback.

Bitrate (from the English bitrate) of an audio file is the number of bits (units of information) used to store one second of audio. The most common unit of measure for bitrate is kilobits per second (kbit/s, kbps). Bitrate is one of the key characteristics of media files, affecting their quality and size. The higher the bitrate at which the music or video was recorded, the better its quality and the larger the file.

Accordingly, changing the bitrate in one direction or another can increase or decrease the file size. But with the impact on the quality of recordings, everything is a little more complicated. While a decrease in the bitrate value naturally leads to a deterioration in the quality of the original file, the opposite operation does not affect the quality in any way. Even if you set the maximum bitrate, the audio and video quality of your file will remain the same.

As you can see, there is little point in increasing the recording bitrate: you will only get a larger file with the same quality. Lowering the bitrate to reduce the file size, however, is entirely feasible. Want to try changing the bitrate of your songs or movies? Download Movavi Video Converter - a convenient program with which you can easily change the bitrate of video and audio recordings, whether they are files in the popular MP3, WMA, AVI and MP4 formats or recordings in more exotic containers. The instructions below use audio files as an example.

1. Install the program to change the bitrate

Download and run the Movavi Video Converter distribution. Follow the instructions on the screen to install the software. When the installation is complete, the converter will start automatically.

2. Add files to the program

Click the Add files button, select Add audio and add the files you want to the program. The program supports many media formats, so the input file can be in almost any format. Change the bitrate of MP3, WMA, AAC and other audio files. Try lowering the video bitrate: work with videos in AVI, MP4, DIVX and various HD video formats. The program will help you cope with a wide range of media conversion tasks!

3. Select the save format

Before changing the bitrate, you need to select the format in which your audio recordings will be saved. To do this, click the Audio tab and select the appropriate format from the list. Having chosen an audio format, click its name and select one of the available bitrate values from the drop-down list (the option is not available for FLAC, OGG, WAV and M4A formats). If you do not want to change the standard bitrate value specified in the selected profile, you can skip the next step and proceed with the conversion.

4. Set the desired bitrate value

Click the gear button to the right of the field Output format... In the list Bit rate type select


It is customary to use the bit rate when measuring the effective transmission rate of a data stream over a channel, that is, the minimum channel size that can pass this stream without delays.

Bit rate is expressed in bits per second (bit/s, bps), as well as in derived units with the prefixes kilo- (kbit/s, kbps), mega- (Mbit/s, Mbps), etc.

The data rate is quantified in bits per second (symbol: "bit/s"), often in conjunction with an SI prefix such as "kilo" (1 kbit/s = 1000 bit/s), "mega" (1 Mbit/s = 1000 kbit/s), "giga" (1 Gbit/s = 1000 Mbit/s) or "tera" (1 Tbit/s = 1000 Gbit/s). The non-standard abbreviation "bps" is often used in place of the standard symbol "bit/s", so that, for example, "1 Mbps" means one million bits per second. One byte per second (1 B/s) corresponds to 8 bit/s.

Characteristics

In streaming video and audio formats (for example, MPEG and MP3) that use lossy compression, the bitrate parameter expresses the degree of compression of the stream and thereby determines the size of the channel for which the data stream is compressed. Most often, the bitrate of audio and video is measured in kilobits per second (eng. kilobit per second, kbps), less often - in megabits per second (only for video).

There are three compression modes for streaming data:

  • CBR (eng. Constant bitrate) - with constant bit rate;
  • VBR (eng. Variable bitrate) - with variable bit rate;
  • ABR (eng. Average bitrate) - with an average bit rate.

Information transfer rate

The physical layer net bit rate, also called the information rate, useful bit rate, payload rate, net data transfer rate, coded transmission rate, effective data rate or wire speed (informal), of a digital communication channel is its capacity excluding physical layer protocol overhead, for example time division multiplexing (TDM) framing bits, redundant forward error correction (FEC) codes, equalizer training symbols and other channel coding. Error-correcting codes are common, especially in wireless communication systems, broadband modem standards and modern high-speed copper-based local area networks. The physical layer net bit rate is the data rate measured at a reference point at the interface between the data link layer and the physical layer, and may therefore include data link and higher-layer overhead.

In modems and wireless systems, link adaptation (automatic adaptation of the data rate and of the modulation and/or error coding scheme to the signal quality) is often applied. In this context, the term peak bit rate refers to the net bit rate of the fastest and least robust transmission mode, used, for example, when the distance between the sender and the receiver is very short. Some operating systems and network equipment can detect the "connection speed" (informal language) of a network access technology or communication device, meaning the current net bit rate. Note that the term line rate is defined in some textbooks as the gross bit rate and in others as the net bit rate.

The relationship between the gross bit rate and the net bit rate is determined by the FEC code rate: the net bit rate is at most the gross bit rate multiplied by the code rate.

Constant bitrate

Constant bitrate - a variant of streaming data encoding, in which the user initially sets the required bitrate, which does not change throughout the entire file.

Its main advantage is the ability to fairly accurately predict the size of the final file.

However, the constant bitrate option is not very suitable for musical works, the sound of which changes dynamically over time, since it does not provide an optimal size / quality ratio.

Variable bitrate

With variable bitrate, the codec selects the bitrate value based on the parameters (the desired quality level), and the bitrate can change over the course of the encoded fragment. When compressing audio, the required bitrate is determined by the psychoacoustic model. This method gives the best quality / size ratio for the output file, but the exact size is very hard to predict. Depending on the nature of the sound (or image, in the case of video encoding), the size of the resulting file may differ several times over.

Average bitrate

Average bitrate is a hybrid of constant and variable bitrates: the value in kbit/s is set by the user, and the program varies it within certain limits. However, unlike VBR, the codec is cautious about using the maximum and minimum possible values, so as not to stray from the user-specified average. This method allows the most flexible choice of bitrate (for audio, it can be any number between 8 and 320 kbps, versus a fixed set of standard values in the CBR method) and predicts the size of the output file much more accurately than VBR.
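The difference in size predictability between the three modes can be illustrated with a small sketch. It assumes MPEG-1 Layer 3 frames of 1152 samples at 44100 Hz; the per-frame bitrates in the "VBR" list are invented for the example and do not come from any real encoder.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer 3 frame length in samples
SAMPLE_RATE = 44100       # Hz


def stream_size_bytes(frame_kbps):
    """Approximate stream size, given the bitrate chosen for each frame."""
    frame_seconds = SAMPLES_PER_FRAME / SAMPLE_RATE
    total_bits = sum(kbps * 1000 * frame_seconds for kbps in frame_kbps)
    return total_bits / 8


def average_kbps(frame_kbps):
    """Mean bitrate over all frames of the stream."""
    return sum(frame_kbps) / len(frame_kbps)


# CBR: every frame uses the same bitrate, so the size is known in advance.
cbr = [192] * 1000
# VBR: hypothetical per-frame choices; the size depends on the material.
vbr = [128] * 400 + [192] * 300 + [320] * 300

print(round(stream_size_bytes(cbr)), average_kbps(cbr))
print(round(stream_size_bytes(vbr)), average_kbps(vbr))
```

With CBR the size follows directly from the duration, while for VBR it is only known once every frame has been encoded; ABR sits in between by steering the average toward a target.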

MP3

MP3 is a lossy audio compression format. Sound quality improves with increasing bit rate:

  • 32 kbps - Generally acceptable for speech only
  • 96 kbps - Typically used for low quality speech or streaming audio
  • 128 or 160 kbps - entry level music encoding
  • 192 kbps - acceptable music encoding quality
  • 256 kbps - high quality music encoding
  • 320 kbps - Highest encoding quality supported by MP3 standard

Other audio

  • 700 bps - the lowest bitrate used by the open-source Codec2 speech codec; speech is barely recognizable, and a 1.2 kbps bitrate gives much better sound
  • 800 bit / s - the minimum needed for recognizable speech, used in the specialized FS-1015 speech codec
  • 2.15 kbps - minimum bitrate of the open source Speex codec
  • 6 kbps - minimum bitrate of the open source Opus codec
  • 8 kbps - telephone quality sound using speech codecs
  • DVD-Audio - a high-quality digital audio format on DVD. DVD-Audio is not intended for video and is not the same as DVD-Video discs

Here we will look at how to choose the right bitrate for your Internet broadcast. Bitrate determines the quality of the video: the higher it is, the higher the quality. So to make a high-quality stream with a great picture, you just need to increase the bitrate and that's it? Not quite. The stream goes out over the network, so a high bitrate fills the Internet channel and the stream becomes impossible to watch. You therefore need to consider the capacity of your own connection and that of your audience. Not everyone has fiber optics, so setting the bitrate above 2 Mbps is not recommended.

The second thing to pay attention to is the so-called bit / pixel ratio. The formula is simple:

bitrate / (pixels per frame * frames per second)

What does this formula mean? Let's say we encode a stream with a resolution of 100px x 100px, at 25 fps (frames per second) and set the bitrate to 250 kbps (kilobits per second). So, for a second, a video with a size of 10,000 pixels (one hundred times a hundred) is allocated 25 frames and 250 kilobits. There are 10 kilobits (10000 bits) for each frame (250/25). We divide the bits allocated per frame by the size in pixels - we get the bit / pixel ratio - how much information is allocated to "encode" one pixel.

The more information is allocated, the higher the quality.

In our example, the bit / pixel ratio is (10,000 bits per frame) / (10,000 pixels) = 1. That is too much. Excellent quality can be obtained with a ratio of 0.1-0.15. For our example, a bitrate of about 25-38 kbps would be enough.
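The same calculation can be written as a short Python function. This is a minimal sketch: the function name is chosen for the illustration, and 1 kbps is taken as 1000 bits per second, as in the arithmetic above.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Bits available to encode one pixel of one frame."""
    return bitrate_bps / (width * height * fps)


# The 100 x 100 px, 25 fps, 250 kbps example from the text.
print(bits_per_pixel(250_000, 100, 100, 25))

# The same function applied to common streaming resolutions at 25 fps.
print(round(bits_per_pixel(1_500_000, 1280, 720, 25), 3))
print(round(bits_per_pixel(5_000_000, 1920, 1080, 25), 3))
```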

Let's calculate the approximate bit / pixel ratios for the most common resolutions:

720p: 1280 × 720 pixels:

  • Bitrate 1500kbps - 1500000 / ((1280 * 720) * 25) = 1500000/23040000 = 0.065
  • Bitrate 2500kbps - 2500000 / ((1280 * 720) * 25) = 2500000/23040000 = 0.109
  • Bitrate 3500kbps - 3500000 / ((1280 * 720) * 25) = 3500000/23040000 = 0.152

1080p: 1920 × 1080 pixels:

  • Bitrate 1500kbps - 1500000 / ((1920 * 1080) * 25) = 1500000/51840000 = 0.029 (as you can see, the ratio at the same bitrate is about 2.25 times lower, so 1080p needs a higher bitrate than 720p)
  • Bitrate 5000kbps - 5000000 / ((1920 * 1080) * 25) = 5000000/51840000 = 0.096
  • Bitrate 7500kbps - 7500000 / ((1920 * 1080) * 25) = 7500000/51840000 = 0.145
  • Bitrate 10000kbps - 10000000 / ((1920 * 1080) * 25) = 10000000/51840000 = 0.193

What conclusions can be drawn? First and foremost: if you cannot provide the bitrate that the resolution requires, don't try to stream at it. Want to stream anyway? Reduce either the resolution or the fps. Bring the bit / pixel ratio up to at least 0.075-0.1, or better, higher.
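Turning the formula around gives the bitrate needed to reach a target ratio, which shows how much lowering the resolution or the fps helps. The sketch below assumes a target of 0.1 bit / pixel and 1 kbps = 1000 bits per second; the function name is illustrative only.

```python
def required_kbps(width: int, height: int, fps: float, ratio: float = 0.1) -> float:
    """Video bitrate (kbps) needed to reach the given bit/pixel ratio."""
    return width * height * fps * ratio / 1000


for label, width, height, fps in [
    ("1080p at 25 fps", 1920, 1080, 25),
    ("720p at 25 fps", 1280, 720, 25),
    ("720p at 15 fps", 1280, 720, 15),
]:
    print(f"{label}: about {required_kbps(width, height, fps):.0f} kbps")
```

Dropping from 1080p to 720p at the same frame rate cuts the required bitrate by more than half, and lowering the fps reduces it proportionally.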

Resolution | Video Bitrate, kbps | Audio Codec | Audio channel
240p (426x240) | 400 (300-700) | AAC or MP3 |
270p (480x270) | 400 (300-700) | AAC or MP3 |
360p (640x360) | 750 (400-1000) | AAC or MP3 |
480p (854x480) | 1000 (500-2000) | AAC or MP3 |
540p (960x540) | 1000 (800-2000) | AAC or MP3 | Mono or Stereo
720p (1280x720) | 2500 (1560-4000) | AAC or MP3 | Mono or Stereo
720p (1280x720) | 3800 (2500-6000) | AAC or MP3 | Mono or Stereo
1080p (1920x1080) | 4500 (3000-6000) | AAC or MP3 | Mono or Stereo
1080p (1920x1080) | 6800 (4500-9000) | AAC or MP3 | Mono or Stereo
1440p (2560x1440) | 9000 (6000-13000) | AAC or MP3 | Mono or Stereo
1440p (2560x1440) | 13000 (9000-18000) | AAC or MP3 | Mono or Stereo
4K / 2160p (3840x2160) | 23000 (13000-34000) | AAC or MP3 | Mono or Stereo
4K / 2160p (3840x2160) | 35000 (20000-51000) | AAC or MP3 | Mono or Stereo




Note: For a better understanding of the text below, I highly recommend that you familiarize yourself with the basics of digital sound.

    S: The higher the bitrate, the better the track

    R: This is not always the case. First, let me remind you what bitrate is. In essence, it is the data rate in kilobits per second during playback. That is, if we take the track size in kilobits and divide it by its duration in seconds, we get its bitrate - the so-called file-based bitrate (FBR). It usually does not differ much from the bitrate of the audio stream (the difference comes from metadata in the track: tags, embedded images, etc.).

    Now let's take an example. The bitrate of uncompressed PCM audio recorded on a regular Audio CD is calculated as follows: 2 (channels) * 16 (bits per sample) * 44100 (samples per second) = 1411200 bps = 1411.2 kbps. Now let's compress the track with any lossless codec (one that does not lead to the loss of any data), for example FLAC. As a result, we get a bitrate lower than the original, but the quality remains unchanged - here is your first refutation.
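Both quantities mentioned so far, the PCM bitrate and the file-based bitrate, are simple enough to compute directly. In this sketch the function names and the sample file size and duration are hypothetical, and 1 kilobit is taken as 1000 bits.

```python
def pcm_bitrate_bps(channels: int, bits_per_sample: int, sample_rate: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return channels * bits_per_sample * sample_rate


def file_based_bitrate_kbps(size_bytes: int, duration_s: float) -> float:
    """File-based bitrate (FBR): file size in kilobits divided by duration."""
    return size_bytes * 8 / 1000 / duration_s


# Audio CD: 2 channels, 16 bits per sample, 44100 samples per second.
print(pcm_bitrate_bps(2, 16, 44100))

# A hypothetical 3,000,000-byte track lasting 200 seconds.
print(file_based_bitrate_kbps(3_000_000, 200))
```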

    Something else is worth adding here. The output bitrate with lossless compression can vary widely (but, as a rule, it is lower than that of uncompressed audio): it depends on the complexity of the compressed signal, or rather on its data redundancy. Simpler signals compress better (a smaller file for the same duration => a lower bitrate), and more complex signals compress worse. That is why lossless classical music has a lower bitrate than, say, rock. But it should be emphasized that the bitrate here is by no means an indicator of the quality of the sound material.

    Now let's talk about lossy (lossy) compression. First of all, you need to understand that there are many different encoders and formats, and even within the same format, the encoding quality for different encoders may differ (for example, QuickTime AAC encodes much better than the outdated FAAC), not to mention the superiority of modern formats (OGG Vorbis, AAC , Opus) over MP3. Simply put, out of two identical tracks encoded by different encoders with the same bitrate, some will sound better, and some will sound worse.

    In addition, there is such a thing as an upconvert. That is, you can take a track in MP3 format with a bitrate of 96 kbps and convert it to MP3 at 320 kbps. Not only will the quality not improve (after all, the data lost during the previous 96 kbps encoding cannot be recovered), it will even worsen, because every stage of lossy encoding (at any bitrate and with any encoder) introduces a certain amount of distortion into the audio.

    And there is one more nuance. If, say, the bitrate of an audio stream is 320 kbps, this does not mean that all 320 kbits were actually spent on encoding a given second. This is typical of constant-bitrate encoding and of cases where someone, hoping to get the maximum quality, forces an excessively high constant bitrate (for example, setting 512 kbps CBR for Nero AAC). As you know, the number of bits allocated to a particular frame is governed by the psychoacoustic model. But when the allocated amount is much lower than the set bitrate, even the bit reservoir does not help (for the terms, see the article "What is CBR, ABR, VBR?"). As a result, we get useless "zero bits" that simply pad the frame to the required size (i.e., inflate the stream to the specified bitrate). By the way, this is easy to check: compress the resulting file with an archiver (preferably 7z) and look at the compression ratio. The higher it is, the more zero bits there are (since they create redundancy) and the more space is wasted.
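The archiver check can be imitated in Python with the standard lzma module (the LZMA family of algorithms is what 7z uses by default). The data below is synthetic: pseudo-random bytes stand in for densely packed frames, and blocks that are half payload and half zeros stand in for padded frames. It is an illustration of the principle, not an analysis of a real audio file.

```python
import lzma
import random


def archive_ratio(data: bytes) -> float:
    """Original size divided by LZMA-compressed size."""
    return len(data) / len(lzma.compress(data))


rng = random.Random(0)
FRAME = 1024   # hypothetical frame size in bytes
FRAMES = 200

# Densely packed stream: pseudo-random bytes barely compress at all.
dense = bytes(rng.getrandbits(8) for _ in range(FRAME * FRAMES))

# Padded stream: each frame is half payload, half zero padding.
padded = b"".join(
    bytes(rng.getrandbits(8) for _ in range(FRAME // 2)) + bytes(FRAME // 2)
    for _ in range(FRAMES)
)

print(f"dense:  {archive_ratio(dense):.2f}")
print(f"padded: {archive_ratio(padded):.2f}")
```

A ratio close to 1 suggests the bits are really being used; a noticeably higher ratio points to padding.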


    S: DVD-Audio sounds better than Audio CD (24 bit vs 16, 96 kHz vs 44.1, etc.)

    R: In principle, this is quite logical and even partly true, but people usually look only at the numbers and very rarely think about what each parameter actually affects.

    So, let's look at the bit depth first. This parameter is responsible for nothing more than the dynamic range, i.e. the difference between the quietest and loudest sounds (in dB). In digital audio, the maximum level is 0 dBFS and the minimum is limited by the noise floor, so the dynamic range is in effect equal to the absolute value of the noise level. For 16-bit audio the dynamic range is calculated as 20 * log10(2^16) ≈ 96.33 dB. The dynamic range of a symphony orchestra is up to 75 dB (mostly about 40-50 dB).
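The formula is easy to evaluate for other bit depths as well. The sketch below applies the same idealized relation; the function name is chosen for the illustration.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.2f} dB")
```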

    Now let's imagine the real conditions. The noise level in the room is about 40 dB (do not forget that dB is a relative value. In this case, the hearing threshold is taken as 0 dB), the maximum music volume reaches 110 dB (so that there is no discomfort) - we get a difference of 70 dB. Thus, it turns out that a dynamic range of more than 70 dB in this case is simply useless. That is, at a higher range, either loud sounds will reach the pain threshold, or quiet sounds will be absorbed by the surrounding noise. It is very difficult to achieve an ambient noise level of less than 15 dB (since the loudness of human breathing and other noises caused by the human factor is at this level), as a result, a range of 95 dB for listening to music is completely sufficient.

    Now about the sampling rate (sample rate). This parameter determines how often the signal is sampled in time and directly limits the maximum signal frequency that the audio representation can describe. By Kotelnikov's theorem (known in English as the Nyquist-Shannon sampling theorem), that maximum is half the sampling rate. That is, for the usual sampling rate of 44100 Hz, the maximum frequency of the signal components is 22050 Hz. The maximum frequency perceived by the human ear is just above 20,000 Hz (and even then only at birth; as we grow older, the threshold drops to 16,000 Hz).

    Read 24/192 Downloads - why they don't make sense.


    S: Different software players sound different (e.g. foobar2000 is better than Winamp, etc.)

    R: To understand why this is not so, you need to understand what a software player is. In essence, it consists of a decoder, handlers (optional), an output plugin (for one of the interfaces: ASIO, DirectSound, WASAPI, etc.) and, of course, a GUI (graphical user interface). Since the decoder in 99.9% of cases works according to a standard algorithm, and the output plugin is just the part of the program that passes the stream to the sound card through one of the interfaces, only the handlers can cause differences. But handlers are usually disabled by default (or should be disabled, since the main thing for a good player is to deliver the sound in its "original" form). As a result, the only things left to compare are the processing and output capabilities, which, by the way, are very often unnecessary. And even if there is such a need, that is already a comparison of handlers, not players.

    Here I would also like to mention my own and, perhaps, to upset users who admire the "colossal" changes in sound after the settings described in it - in 95% of cases this is self-hypnosis (except, of course, when during its setting some " enhancer "or another processor that spoils the whole picture). Sadly, the gains from all these tweaks with ReplayGain, resamplers and limiters are paltry. Conclusion: if you want really high-quality sound - buy yourself Hi-Fi acoustics and a professional sound card.


    S: Different driver versions sound different

    R: This statement rests on plain ignorance of how a sound card works. A driver is software that a device needs in order to communicate effectively with the operating system; it usually also provides a graphical user interface for controlling the device, its settings, etc. A sound card driver ensures that the card is recognized as a sound device, informs the OS about the formats the card supports, transfers a (usually uncompressed) PCM stream to the card, and gives access to the settings. In addition, in the case of software processing (by the CPU), the driver can contain various DSPs (handlers). Therefore, firstly, with effects and processing disabled, a driver that fails to transfer PCM to the card accurately has a gross error, a critical bug, and that happens rarely. Secondly, drivers can differ in updated processing algorithms (resamplers, effects), although this also happens very rarely. In any case, effects and any driver processing should be disabled or bypassed to achieve the highest quality.

    Thus, driver updates are mainly focused on improving stability and fixing handling errors. In our case, neither the one nor the other affects the playback quality, therefore in 999 cases out of 1000 the driver does not affect the sound.


    S: Licensed Audio CDs sound better than their copies

    R: If during copying there were no (fatal) read / write errors and the optical drive of the device on which the copy will be played has no problems with reading it, then such a statement is erroneous and easily refuted.


    S: Stereo encoding mode gives better quality than Joint Stereo

    R: This misconception mainly concerns LAME MP3, as all modern encoders (AAC, Vorbis, Musepack) use only Joint Stereo mode (and that already says something).

    For a start, it's worth mentioning that Joint Stereo mode is successfully used with lossless compression. Its essence lies in the fact that the signal before encoding is decomposed into the sum of the right and left channels (Mid) and their difference (Side), and then these signals are separately encoded. In the limit (for the same information in the right and left channels), double data savings are obtained. And since in most music the information in the right and left channels is quite similar, this method is very effective and allows you to significantly increase the compression ratio.

    In lossy compression the principle is the same. But here, in constant bitrate mode, the quality of fragments with similar information in the two channels will increase (in the limit, it will double), while in VBR mode the bitrate in such places will simply decrease (do not forget that the main task of VBR mode is to maintain the specified encoding quality consistently, using the lowest possible bitrate). Since lossy encoding gives priority (in the allocation of bits) to the sum of the channels, dynamic frame-by-frame switching between Joint Stereo (Mid / Side) and normal (Left / Right) stereo modes is used to avoid degrading the stereo panorama. By the way, this misconception arose from the imperfect switching algorithm in older versions of LAME, as well as from the Forced Joint mode, in which there is no automatic switching. In recent versions of LAME, Joint mode is enabled by default and changing it is not recommended.
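The Mid / Side decomposition itself fits in a few lines. The sketch below works on lists of integer samples and uses the plain sum and difference, as described above; real codecs may scale or store these signals differently, and the function names are illustrative.

```python
def to_mid_side(left, right):
    """Decompose Left/Right samples into Mid (sum) and Side (difference)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Exact inverse: mid + side == 2 * left and mid - side == 2 * right."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


left = [100, -250, 3000, 7]
right = [98, -251, 3010, 7]

mid, side = to_mid_side(left, right)
print(side)  # small values when the channels are similar
assert to_left_right(mid, side) == (left, right)
```

When the two channels are similar, the Side signal stays close to zero and is cheap to encode; when they are identical, it is all zeros.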


    S: The wider the spectrum, the better the recording (about spectrograms, auCDtect and frequency range)

    R: In our time on the forums, unfortunately, it is very common to measure the quality of a track with a "ruler on the spectrogram". Obviously, because of the simplicity of this method. But, as practice shows, in reality everything is much more complicated.

    And the point is this. The spectrogram visually shows the distribution of signal power over frequencies, but it cannot give a complete picture of how the recording sounds or of the distortions and compression artifacts in it. In fact, all that can be determined from the spectrogram is the frequency range (and, partially, the spectrum density in the HF region). At best, then, analyzing the spectrogram lets you identify an upconvert. Comparing the spectrograms of tracks produced by various encoders with that of the original is sheer absurdity. Yes, you can identify differences in the spectrum, but it is almost impossible to determine whether (and to what extent) the human ear will perceive them. We must not forget that the task of lossy encoding is to produce a result that the human ear (not the eye) cannot distinguish from the original.

    The same applies to assessing the quality of encoding by analyzing the output tracks with the auCDtect program (Audiochecker, auCDtect Task Manager, Tau Analyzer, fooCDtect are just shells for the one-of-a-kind auCDtect console program). The auCDtect algorithm also actually analyzes the frequency range and only allows you to determine (with a certain degree of probability) whether MPEG compression was applied at any of the encoding stages. The algorithm is tailored for MP3, so it is easy to "cheat" using Vorbis, AAC and Musepack codecs, so even if the program writes "100% CDDA" it does not mean that the encoded audio is 100% identical to the original one.

    And, returning to the spectra. Also popular is the desire of some "enthusiasts" to turn off the lowpass filter in the LAME encoder at all costs. This shows a lack of understanding of the principles of coding and psychoacoustics. First, the encoder cuts high frequencies for one purpose only: to save data and spend it on encoding the most audible frequency range. An extended frequency range can be fatal to the overall sound quality and lead to audible encoding artifacts. Moreover, turning off the cutoff at 20 kHz is completely unjustified, since a person simply does not hear frequencies above that.


    S: There is a kind of "magic" EQ preset that can significantly improve the sound

    R: This is not entirely true, firstly, because each separately taken configuration (headphones, acoustics, sound card) has its own parameters (in particular, its amplitude-frequency characteristic). And therefore, each configuration should have its own, unique approach. In simple terms, such an EQ preset exists, but it is different for different configurations. Its essence lies in adjusting the frequency response of the path, namely, in "leveling" unwanted dips and surges.

    Also, among people who do not work with sound directly, setting the graphic equalizer in a "tick" shape is very popular. In effect this boosts the low and high frequency components, but at the same time it muffles vocals and instruments whose sound spectrum lies in the midrange.


    S: Before converting music to another format, you should "decompress" it to WAV

    R: I'd like to point out right away that WAV here means PCM (pulse code modulation) data in a WAVE container (a file with the *.wav extension). This data is nothing more than a sequence of bits (zeros and ones) in groups of 16, 24 or 32 (depending on the bit depth), each of which is the binary code of the amplitude of the corresponding sample (for example, for 16 bits these are decimal values from -32768 to +32767).

    The fact is that any sound processor, be it a filter or an encoder, usually works only with these values, that is, only with uncompressed data. This means that to convert audio from, say, FLAC to APE, you simply have to decode the FLAC to PCM first and then encode the PCM to APE. It is like repackaging files from ZIP to RAR: you must first unpack the ZIP.

    However, if you are using a converter or just an advanced console encoder, the intermediate conversion to PCM happens on the fly, sometimes even without writing to a temporary WAV file. This is what misleads people - it seems that the formats are converted directly to one another, but in fact, such a program necessarily has an input format decoder that performs an intermediate conversion to PCM.

    Thus, manual conversion to WAV will give you absolutely nothing but wasting time.
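The "decode first, then encode" pipeline can be illustrated with the archive analogy from the answer above. The sketch uses two general-purpose compressors from the Python standard library (zlib and bz2) in place of audio codecs; the raw bytes in the middle play the role of PCM, and the function name is made up for the example.

```python
import bz2
import zlib


def repack(zlib_blob: bytes) -> bytes:
    """Convert zlib-compressed data to bz2 via the uncompressed form."""
    raw = zlib.decompress(zlib_blob)  # "decode": back to raw bytes (the PCM analogue)
    return bz2.compress(raw)          # "encode": into the target format


original = b"left right left right " * 100
converted = repack(zlib.compress(original))

# The conversion happens in memory; no intermediate file is written.
assert bz2.decompress(converted) == original
```

A converter does the same thing internally, which is why unpacking to an intermediate file by hand adds nothing.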


The MP3 file format is regarded as an "open format" supported by most manufacturers.

The MP3 format is one of the most common digital audio encoding formats. A feature of MP3 audio encoding is lossy encoding. However, the coding is based on a special model that takes into account the peculiarities of auditory perception. Therefore, the presence of losses does not lead to catastrophic degradation of sound.

MP3 files have become the de facto standard, and playback is supported by most of the popular operating systems, many CD and DVD players and other devices.

Interestingly, the standard describes the actual storage format, and not the way the sound is encoded. As a result, there are a lot of tools available for playing MP3 audio.

Special codecs are used to encode audio in MP3 format.
An audio codec can be one of two types - hardware codec and software codec.

Hardware coding is performed using special microcircuits.
Software coding is performed using special computer programs.

The audio quality in MP3 format (all other things being equal) depends on the compression ratio (read: the amount of loss) and on the encoding program. That is why branded players that use codecs and audio signal processing systems from well-known brands are significantly superior in playback quality to ordinary devices built from standard components.

The quality of the actual playback depends on the size of the data stream from the media. The amount of data stream is sometimes referred to as stream width. There is a special term - bitrate. The data flow rate is defined in kilobits per second and is denoted kbs, kbps, kb / s. The recording can be encoded in several ways - with constant bit rate and with variable bit rate. Variable bitrate helps preserve detail by increasing the amount of data.

Not every bitrate is suitable for high-quality music playback; see Table 1.

MP3 bit rates and applications

Table 1

The data shown in Table 1 can only serve as a guide. The fact is that at the time the MP3 format appeared, the quality of mass-market audio equipment was not very high. Many reputable publications seriously argued that a data stream of 128 kb/s was sufficient for high-quality sound reproduction.

Currently, a bitrate of at least 192 kb/s is considered to be of high quality. Moreover, the widespread adoption of Hi-Fi, Hi-End and home theater systems has led to a massive shift towards high-quality audio reproduction.

Therefore, the flaws in sound reproduction, imperceptible on the budget equipment of the past, become noticeable to the "unprepared listener" who uses modern high-quality technology. By the way, the level of this very "unprepared listener" has grown significantly.

In general, the idea of compression (and especially lossy compression) is gradually becoming obsolete. Having appeared in the era of expensive storage media and low-bandwidth data transmission channels, the idea of data compression did its job perfectly. However, sound lovers are gradually switching to higher bitrates (compression with lower losses), to lossless formats, or even to uncompressed formats.

The practicality of compressed formats, and of the MP3 format in particular, led to the release of compact MP3 players built around memory chips or miniature hard drives.

When choosing this or that model of such a player, a question arises related to the amount of its memory. Naturally, the user wants to estimate in advance the amount of musical material that he can save at a time on his MP3 player.

Approximate data on the volume of files and duration of the sound are collected in Table 2. When using Table 2, it should be borne in mind that these are approximate data that allow you to estimate the required memory capacity of players or removable media.

MP3 length and compression ratio

Table 2

Columns: bitrate (kb/s); 1-minute recording (KB); standard 3-minute composition (MB); standard 4-minute composition (MB); standard 5-minute composition (MB).

Note to Table 2
A high degree of compression corresponds to 56 kb/s; low compression and high sound quality correspond to 320 kb/s.
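Estimates of the kind collected in Table 2 follow from simple arithmetic: size in bytes = bitrate in bits per second × duration in seconds ÷ 8. The sketch below rests on several assumptions: the list of bitrates is chosen for illustration (only 56 and 320 kb/s are named in the note), 1 kb/s is taken as 1000 bits per second, and KB and MB are decimal (1000 and 1,000,000 bytes). Tags and other service data are ignored, so real files come out slightly larger.

```python
def mp3_size_bytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bitrate MP3 stream in bytes."""
    return bitrate_kbps * 1000 * seconds // 8

# Assumed set of bitrates, for illustration only.
BITRATES_KBPS = [56, 128, 192, 256, 320]

def size_table():
    """Rows of (bitrate, KB per minute, MB for 3, 4 and 5 minutes)."""
    rows = []
    for rate in BITRATES_KBPS:
        kb_per_minute = mp3_size_bytes(rate, 60) / 1000
        mb = [mp3_size_bytes(rate, m * 60) / 1_000_000 for m in (3, 4, 5)]
        rows.append((rate, kb_per_minute, *mb))
    return rows

print("kb/s   KB/min   3 min   4 min   5 min (MB)")
for rate, kb, mb3, mb4, mb5 in size_table():
    print(f"{rate:>4} {kb:>8.0f} {mb3:>7.2f} {mb4:>7.2f} {mb5:>7.2f}")
```

The same function answers longer questions as well: `mp3_size_bytes(192, 3 * 3600)` gives 259,200,000 bytes, or about 259 MB for a three-hour recording at 192 kb/s.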

Table 3 provides an estimate of the total length of music recordings - playing time of a player with a certain amount of memory.

The total playing time of the MP3 player depending on the amount of memory

Table 3

Columns: memory size (GB); duration of sound in minutes and hours for each bitrate (kb/s).

As far as can be judged from Table 3, a capacity of 8 GB is enough to store MP3 recordings of the highest quality for about 8 hours of listening every day for a week (7 days), with no repetitions! Hardly anyone really has such a need.

Even at that rate, you would need to refresh the recordings on the player no more than once a week.
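The 8 GB estimate can be checked with a short calculation. This is a rough sketch under stated assumptions: the whole capacity is available for music (in practice the file system and firmware take some of it), 1 kb/s is 1000 bits per second, and the function name and the `bytes_per_gb` parameter are introduced here only for illustration.

```python
def playing_time_hours(capacity_gb, bitrate_kbps, bytes_per_gb=1_000_000_000):
    """Hours of audio that fit into the given capacity at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return capacity_gb * bytes_per_gb / bytes_per_second / 3600

# 8 GB at the highest MP3 bitrate, with decimal and with binary gigabytes.
print(round(playing_time_hours(8, 320), 1))
print(round(playing_time_hours(8, 320, bytes_per_gb=2**30), 1))
```

Depending on whether a gigabyte is counted as 10^9 or 2^30 bytes, the result is roughly 56 to 60 hours, which matches 8 hours a day for 7 days.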

2013 site. All rights reserved.
