One small step for ffmpeg... one giant leap for Emby! (Looking for C developer(s) to help with transcode throttling)



jluce50

Bonus points if you're familiar with the ffmpeg source (MAJOR bonus points if you've successfully built it! ;) ). 

 

[I thought about posting this in the dev section, but I figured there may be some developers out there who have thought about getting involved but have held off for one reason or another (much like me). If you have some experience with C, this would be a great way to dip your toes in the water. Phase 1 won't require much time; just looking at some code and a bit of brainstorming. The hard stuff comes later. Anyway...]

 

At this point in the game, I think it's safe to say that the only area MB3 is really lagging behind Plex is when it comes to transcoding, specifically the ability to intelligently throttle the process. Getting this working would be a huge leap forward for MB and would solidify its position as best-in-class (IMO). I'd like to make that happen. 

 

I spoke with Luke a bit regarding the direction we need to go to make this happen. I have a good understanding at a high level, but digging into the ffmpeg source made it clear that this should be more of a group effort. So I'd like to put together a group to take this on. How much you contribute is entirely up to you. To begin with, just having one or two people familiar with C who are willing to take a look at the code, answer some questions, and maybe help put together a solution approach would be great. 

 

If you're interested, reply to the thread or PM me. 


jluce50

Okay, here's a quick overview. As I understand it, Plex uses a custom build of ffmpeg to achieve control over the speed of transcoding. When MB attempted to implement this, rather than modify ffmpeg itself, they pointed it at an API endpoint and used that to control how fast data was fed in. This worked fine on most machines but struggled on lower-powered ones. There were some other, less severe drawbacks as well. 

 

So, the "right" way to do this is to just bite the bullet and create a custom build of ffmpeg. ffmpeg currently has a flag, "-re", that tells it to read the input at its native framerate (by default it reads as fast as it can). So somewhere there is code that controls the speed at which the input is read. We'd want to use that as a springboard for some new methods that let us specify the speed at which it is read (rather than just "full speed" or "native"). So how would we indicate, in real time, how fast we want it to process? Well, ffmpeg also currently accepts keypresses over stdin while processing. The only one I know of is "q", which quits the process. But, again, if we can piggyback on the existing code and expand it to accept other keys (say 0-9), we should be able to control this while the process is running. 

 

I've managed to clone the ffmpeg source into Eclipse but I'm far from being able to build. It's a bit of a pain, to put it mildly. I've been poking around trying to narrow down where this functionality is located, but my unfamiliarity with C is making it a very slow process. Not to mention it is a large and complex program. And that's where I'm hoping some others will come in. If we can collaborate on this, I think it'll go much faster and have a greater chance of success.


I am not familiar with the code base, but here is my thinking, in theory of course:

 

- find the code that turns on/off "read input at native framerate", which is controlled by -re

- add methods to allow it to be changed during encoding

- add key inputs to call those methods

 

Once those areas are located, I don't think it will be difficult at all. The real fun comes after in trying to build it :)


techywarrior

Wouldn't a potentially simpler method be to "pause" ffmpeg? You'd have to make sure that you actually kill the process when playback stops, or after a set amount of time with no update; otherwise it could get bad.

 

My thought is that you would then set ffmpeg to encode as fast as it can for a preset amount of time of the video (not real time), say 5 minutes. Playback now has 5 minutes of video to stream. At 2.5 minutes in, or when seek is used, the server requests ffmpeg to unpause and encode another 5 minute chunk.

 

This would require less ffmpeg customization (I think) and would let the CPU drop to a low-power state quickly after the encode (this method would be much more power efficient than limiting the FPS of ffmpeg). The added benefit, of course, is that a lot of encoding isn't wasted if you don't watch the entire video, and there should be less CPU strain with multiple streams.

 

If pausing keeps everything in memory and is not a beneficial route then perhaps closing ffmpeg and reopening it at each interval would be better. But then you'd need to track exactly where it needs to start from in the file.


jluce50

Wouldn't a potentially simpler method be to "pause" ffmpeg? [...]

 

 

As I've been looking at the code, I've been trying to think of another way to throttle the processing thread. One option would be to just slow down the processing loop. Generally, if I see Thread.Sleep(x) it's an indicator of poor design. The main problem is that it's not at all consistent: 'x' is really the minimum time to sleep for, and the actual time could be orders of magnitude higher. What you're really doing is yielding to other threads, and when your thread comes back depends on what else is going on. Then again, we don't necessarily care about millisecond precision in this scenario, so maybe it would be an option? Just thinking out loud here...


As I've been looking at the code, I've been trying to think of another way to throttle the processing thread. [...]

 

I don't think you need to get too creative because at the ffmpeg level it doesn't need to be smart. It just needs commands so that it can be managed by a smart controller.

 

So however you want to do it, we could make it work from MBS, such as commands to flip on/off the -re option, and/or commands to set a speed level based on a numeric value.


Also thinking aloud here, I think the best possible end game would be making this as general as possible and submitting a pull request to ffmpeg. If this could make it into the official build that would be a huge relief for our package builders because then they wouldn't have to add this to their build process.


jluce50

I'm sure if this worked we probably wouldn't be having this conversation, but is it not possible to just suspend the ffmpeg thread in MBS using something like NativeMethods.SuspendThread()?


Is there a Mono version of that method, or non-Windows equivalents? I think it is good to keep the encoding going, because at low speed the CPU utilization is negligible.

 

However, SuspendThread could be a quick win if it works multi-platform.


jluce50

That I don't know. Perhaps @cayars might be able to chime in here...

 

Edit: SuspendThread comes from kernel32.dll. It looks like Mono.Unix.UnixProcess.Signal *might* be the equivalent (using SIGSTOP and SIGCONT). I'm not familiar with Mono (or Unix/Linux), though, so take this with a giant grain of salt.

Edited by jluce50

techywarrior

As I've been looking at the code, I've been trying to think of another way to throttle the processing thread. [...]

I didn't mean an ffmpeg sleep timer. I meant that we would request that ffmpeg encode and either have it wait for a request to continue once it reaches a specific length (5 minutes) or, if that is not feasible, just request that ffmpeg only encode 5 minutes of video, terminate, and then request another 5-minute block in 2-3 minutes' time (handled by MBS).


jluce50

I didn't mean a ffmpeg sleep timer. [...]

 

Right, I got that. I was just using what you said as a springboard to my own thoughts on the matter. I would personally lean toward speeding up/slowing down ffmpeg, rather than killing and restarting the process (seems less complex). It's still early though, so that approach is definitely worth considering.


techywarrior

Right, I got that. I was just using what you said as a springboard to my own thoughts on the matter. [...]

The problem that I have with slowing down ffmpeg is that it's not efficient at all. I also don't see how it will benefit anyone really.

 

If thread 1 is running at 200fps and using 90% CPU and a second ffmpeg thread is started, I believe thread 1 will slow down, won't it? Or will it continue at the current speed and thread 2 will be severely limited? That would be a big thing to find out. If thread 1 slows down, then what does artificially slowing it down before a second thread exists accomplish?

 

My suggestion is to stop/start because CPU idle states are much, much more power efficient than running at 50% or anything other than one of the 3 idle states (in Intel CPUs). Also, stopping/starting allows us to save the encoding of video that never gets watched (end of credits, stop because you have to do something, don't like the movie, etc.)

 

Also, assuming that the second thread doesn't start at the same time, or on an interval, there will be no conflict between them and the CPU load would be less than now. Since each chunk is much smaller than the current method of encoding the entire thing at once, it is more likely that both won't be running at the same time. Obviously the benefits get a little smaller the more threads that are concurrently running, but there is still a benefit to this method.

 

Unfortunately I don't know C, so I can't help with the project, but I can certainly help from a logic standpoint and try to help figure out the best way to accomplish a better transcoding method.


jluce50

The problem that I have with slowing down ffmpeg is that it's not efficient at all. I also don't see how it will benefit anyone really.

I suppose I could be wrong, but throttling the processing via slowing down the loop or limiting the read framerate of ffmpeg should be much more efficient than the current functionality. It may not allow the cpu to enter idle state, but it's still going to improve the user's experience.

 

If thread 1 is running at 200fps and using 90% CPU and a second ffmpeg thread is started, I believe thread 1 will slow down, won't it? Or will it continue at the current speed and thread 2 will be severely limited? That would be a big thing to find out. If thread 1 slows down, then what does artificially slowing it down before a second thread exists accomplish?

I think it has to do with how much it slows down. If thread 1 gets way ahead of the user's current position, it could slow down to, say, 10% of the normal fps. Whereas currently both threads will (theoretically) go to 50%, which might severely affect thread 2 because that user just started their stream and needs all the fps they can get.

 

My suggestion is to stop/start because CPU idle states are much, much more power efficient than running at 50% or anything other than one of the 3 idle states (in Intel CPUs). Also, stopping/starting allows us to save the encoding of video that never gets watched (end of credits, stop because you have to do something, don't like the movie, etc.)

Depending on how this is controlled in MBS, we could still save a lot of unnecessary encoding (e.g. if it generally tries to stay 5 minutes ahead of the user's current position before throttling down).

 

Also, assuming that the second thread doesn't start at the same time, or on an interval, there will be no conflict between them and the CPU load would be less than now. Since each chunk is much smaller than the current method of encoding the entire thing at once, it is more likely that both won't be running at the same time. Obviously the benefits get a little smaller the more threads that are concurrently running, but there is still a benefit to this method.

Unless they're timed perfectly, there would still be some overlap, though. With the throttling approach, their relative processing needs can be taken into account rather than just letting the two threads duke it out when one thread might have a lot less cushion than the other.

 

These are all just my thoughts and I'm not married to any particular approach. You clearly have a lot more experience with the inner workings of MB, so factor that in as you read my take on all this...

 

Unfortunately I don't know C, so I can't help with the project, but I can certainly help from a logic standpoint and try to help figure out the best way to accomplish a better transcoding method.

No worries. The more input we get, the better!


techywarrior

Hm, some good points.

 

With my suggestion, I don't think there is a benefit to restricting the fps even if there are multiple threads. If the CPU is being taxed and can't keep up, then slowing down the fps doesn't help. On a low-end CPU, restricting the fps is going to drastically increase the time ffmpeg is running and increases the likelihood that another thread will need the CPU.

 

Even on low end CPUs I think my suggestion is better.

 

We probably need to get some numbers (fps, encode times, CPU %) from a low end CPU and run some numbers and see theoretically how the methods stack up.


jluce50

 

We probably need to get some numbers (fps, encode times, CPU %) from a low end CPU and run some numbers and see theoretically how the methods stack up.

 

Now that I totally agree with. Perhaps some combination of the two would be a viable option...


techywarrior

I tried to think of some combination, but I just couldn't think of a way that was beneficial. Let's get some numbers from someone and then I can try to write a formula.


I myself would not do it this way but instead would keep it simple.  I would use the ffmpeg commands:

-ss The offset time from the beginning. A value in seconds or hours:minutes:seconds.milliseconds, as in 00:02:00.00.

-t Duration. Also accepts a value in seconds or hours:minutes:seconds.milliseconds, as in 00:00:10.00.

 

No changes to ffmpeg needed.

 

Using the above two command-line arguments you can, for example, create 10-second chunks or segments. The server software then just streams them in order.

 

So by doing it this way you can stop/pause at any 10-second segment, and if someone fast-forwards three-quarters (or wherever) of the way through the movie, you can just start from there. Using chunked/segmented encodes then makes it very easy to do things like process 2 minutes at a clip and make sure you are always 30 seconds ahead of the client.

 

These should both be simple changes, I'd think, but I haven't checked. This would surely make control of CPU use during transcoding very efficient, and it would lend itself really well to any distributed encoding techniques, since multiple computers can work on the same video at the same time if needed. Think h.265 transcoding, which is really hard on the CPU.

 

Carlo

Edited by cayars

techywarrior

That sounds similar to what I was suggesting, Cayars. With the added benefit of actually knowing the commands in ffmpeg :)

 

So by doing what you suggest, we wouldn't need to change ffmpeg at all. All that would need to happen is that MBS limits the duration to a fixed amount, and when the client reports that it is within X amount of time of the end of the segment, it starts encoding the next (or maybe it stays 1-2 chunks ahead just in case).


techywarrior

It's not that easy, I'm afraid, because the seek isn't guaranteed to be exact. 

It never is, is it? :)

 

Couldn't the seek kick off a new encode, though? It would then take 5-minute (or whatever chunk size is used) chunks from that point on. The server would still know each offset to request based on the start of the seek. Or it could compensate for the difference in the first requested chunk as a one-time request and go back to even chunks. Whichever is easier.


Works well enough for Plex.

 

Technically it doesn't have to be exact. If it was only "close enough" to the key frames and 3 frames or so were skipped or repeated, you would not even notice. Matter of fact, many "transcoders" skip frames anyway in order to work faster.

But that is beside the point. It will be much more accurate than you think it will be.

 

Anyone can do an easy experiment. Grab a movie file and use the commands above to create 5-second chunks, in order, for say 5 minutes of your movie.

 

Now combine them back again.

 

Now watch the 5-minute segment and see how smooth it is. I bet you can't tell where the chops took place.


Works well enough for Plex. [...] I bet you can't tell where the chops took place.

 

Are you really sure that's what they do? If so, that would mean you can't throttle when the media is unseekable or you don't know the duration (e.g., internet channel content). 
