resources JS vs actors
-
Hi Dill,
Recently I have looked at a comparison between JSON, Javascript and other native Isadora actors to run point cloud particle systems. Basically the implementation has been about the speed of reading coordinate numbers from an external file - for example reading in XYZ data as many thousands of coordinate number range triplets and then separating and routing these number ranges using either JSON reader/JSON Parser, Data Array/Javascript, or other native Isadora actors like Selector and Router.
In fact all methods were workable and in the end I could not tell you if one method provided any more efficiency than another. All had very low overhead in terms of the process of reading and routing the coordinate data.
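For anyone wanting to try the JavaScript route, here is a minimal sketch of the "separate into triplets" step. It assumes the file holds a flat JSON array of numbers (x0, y0, z0, x1, ...); `toTriplets` and `parsePointCloud` are hypothetical helper names, not part of Isadora.

```javascript
// Hypothetical helper: split a flat list of numbers into XYZ triplets.
// An incomplete trailing triplet is dropped.
function toTriplets(flat) {
  const points = [];
  for (let i = 0; i + 2 < flat.length; i += 3) {
    points.push({ x: flat[i], y: flat[i + 1], z: flat[i + 2] });
  }
  return points;
}

// Assumed file layout: a JSON array of numbers, "[x0, y0, z0, x1, ...]".
function parsePointCloud(jsonText) {
  const flat = JSON.parse(jsonText);
  return toTriplets(flat);
}

const cloud = parsePointCloud("[0, 1, 2, 10, 11, 12]");
console.log(cloud.length, cloud[1].z);
```

How the triplets are then routed to outputs depends on the patch, so that part is left out.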
Best Wishes
Russell
-
When we first added the Javascript actors, I built some large test patches in which I programmed the same algorithm used in the 'Smoother' actor in Javascript (it involves basic arithmetic).
I found that I could create thousands of copies of this JS user actor, continuously processing incoming values, before it began to add any real load to the CPU. The native Smoother actor did, however, allow for twice as many.
At that level I really don't think JS has much more overhead; it's very fast at math, and I suspect also at string operations.
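To give a feel for the kind of arithmetic involved, here is a rough sketch of a smoother in JavaScript. It assumes simple exponential smoothing, which may not be what the native Smoother actor actually does; `makeSmoother` and the `smoothing` parameter are names made up for this example.

```javascript
// Hypothetical smoother: exponential smoothing, NOT necessarily the
// algorithm used by the native 'Smoother' actor.
// smoothing = 0 follows the input immediately; values near 1 react slowly.
function makeSmoother(smoothing) {
  let state = null; // last smoothed value, kept between calls
  return function smooth(value) {
    if (state === null) {
      state = value; // first value passes straight through
    } else {
      state = state + (1 - smoothing) * (value - state);
    }
    return state;
  };
}

const smooth = makeSmoother(0.5);
const results = [0, 10, 10].map((v) => smooth(v));
console.log(results); // [0, 5, 7.5]
```

Each call is only a multiply and a couple of additions, which is consistent with many copies running before the CPU notices.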
There is a benefit in minimizing the number of actors used in your user actors. Each actor adds size to the User Actor, and each copy of the user actor used in your Isadora file adds to the size of your IZZ file.
Ideally, I would suggest creating one user actor using JS and one using the native actors, saving each to your desktop, and comparing the file sizes. This may be the more important metric.
NOTE: your Javascript code makes your patch larger, so fewer comments etc. will help decrease its size. Even variable names will have an impact if you are making many copies.
-
You have to cross the boundary from C++ to JavaScript. That is an expensive operation.
Personally, I would bundle all the operations that I need (that I would normally create with actors) inside the JavaScript code, so that instead of, for example, 10 actors you only need 1 JavaScript actor.
Also be aware that actors in general are more readable if you hand the patch to another tech to run the show. Comment your JavaScript code, since in a year, when you have to add something, you won't know how it works.
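As a rough illustration of the bundling idea, the sketch below does scaling, limiting and rounding in one script, so the C++ to JavaScript boundary is crossed once per value instead of once per step. The `main(value)` entry point shape and the helper names are assumptions for this example; check the JavaScript actor documentation for the exact calling convention.

```javascript
// Each helper stands in for what might otherwise be a separate native actor.
function scale(v, inMin, inMax, outMin, outMax) {
  return outMin + ((v - inMin) / (inMax - inMin)) * (outMax - outMin);
}

function clamp(v, lo, hi) {
  return Math.min(hi, Math.max(lo, v));
}

// Assumed entry point shape: one input in, an array of outputs back.
function main(value) {
  const scaled = scale(value, 0, 127, 0, 100); // e.g. a 0-127 range to percent
  const limited = clamp(scaled, 0, 100);       // keep the result within 0-100
  return [Math.round(limited)];                // whole numbers only
}

console.log(main(63.5), main(200)); // [50] [100]
```

The comments are exactly the kind of thing that helps a year later, at the cost of a slightly larger file.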
-
@juriaan I saw this on a Unity forum earlier today, and they were struggling to implement it, something about compatibility dependencies. Would this be doable in Izzy? Please excuse my coding ignorance if I'm totally off the mark; I just thought that it would be an amazing actor. https://blog.tensorflow.org/2021/05/next-generation-pose-detection-with-movenet-and-tensorflowjs.html?m=
-
Thank you for your opinions and advice!
I found a much more resource-hungry part of the patch in the generative actors 'text draw' and 'shapes'. They are very heavy on the CPU, which I didn't expect.
I also noticed that Isadora seems to use just one CPU core for this. While the 'Activity Monitor' (I'm on macOS) shows significantly more activity on all threads when running the patch, the Isadora process peaks at only 100%. (That seems OK at first, but as far as I know each core/thread counts as 100%, which means an 8-core CPU could reach up to 800%, or 1600% with all threads used.) While the CPU History barely shows full load on any of the threads, Isadora's LOAD is shooting to the moon fast.
In the attached screenshot I pushed the boundaries on purpose and set the default resolution in the preferences to UHD size. I'm OK with very low resolutions, as I only need some numbers and symbols, so it isn't crucial for the project. But still, it seems a bit odd to me. I probably don't know enough about the system behind the Isadora engine, but maybe there is a solution I don't see.
-
This is not possible with Javascript in Isadora since this code relies on browser specific capabilities that are not available in the pure Javascript V8 engine used in Isadora.
-
@dillthekraut said:
Isadora Process peaks to 100%
This is likely due to Text Draw. Drawing text is a CPU heavy process. I generally throttle down text drawing to a rate of 1/2 or 1/4 the frame rate.
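The throttling idea can be sketched in a few lines. This is only an illustration of "draw on every Nth frame"; `makeThrottle` is a made-up name, and in a real patch the same thing could be wired up with native actors.

```javascript
// Hypothetical frame gate: returns true on every Nth call only.
// divisor = 2 gives half the frame rate, divisor = 4 a quarter.
function makeThrottle(divisor) {
  let frame = 0; // counts calls, one per rendered frame
  return function shouldDraw() {
    const draw = frame % divisor === 0;
    frame += 1;
    return draw;
  };
}

const shouldDraw = makeThrottle(4);
const pattern = Array.from({ length: 8 }, () => shouldDraw());
console.log(pattern); // true on frames 0 and 4, false otherwise
```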
Additionally, you can save a LOT of CPU not only by rendering your text at a reduced rate, as I mentioned, but also by rendering it to a smaller rectangle than the full stage. This is especially critical if you are going to be animating the position of the text. It is way better to render your text to, say, a 300w x 30h rectangle and then place that rect via the Matte++ actor to allow location compositing. In this way you can have the animation running at the full frame rate, and only draw the text (to the small rect) when the text changes and/or at a fraction of the frame rate.
-
I have made a quick demo that shows an optimized method of drawing text while animating it. Of course this may not be exactly what you need. Isadora, being so flexible, really offers so many options, but hopefully it helps to make the point about pre-rendering text into bite-size graphic rectangles before placing/modifying them.
Text Draw will render higher-quality modifications (scaling, rotation etc.), but at the cost of more CPU; finding a balance between performance and quality is the trick.
But positioning is 100% better done outside Text Draw (as in the demo file).
-
@dusx I knew that I had oversimplified the process as soon as I had posted. Oh well, thanks. Sorry for gatecrashing this thread; the possibility of it working had made me a little overexcited.
-
@n-jones said:
I knew that I had oversimplified the process as soon as I had posted. Oh well, thanks. Sorry for gatecrashing this thread; the possibility of it working had made me a little overexcited.
Just know that if I were only coding Isadora for myself, and not taking the user base into account, getting that kind of pose detection into the program would be where I spent every second of my time. It is possible by the way, but just not "out of the box" using Javascript for the reasons @DusX mentioned. It's possible to move the TensorFlow stuff into C++ I reckon, but it is something I have zero experience with.
Best Wishes,
Mark
-
@dusx said:
This is likely due to Text Draw. Drawing text is a CPU heavy process.
Just to expand on the "why" of this: every character you draw is specified by a bunch of Bézier curves.
When you draw text, the CPU (not the faster GPU) has to figure out where to draw all the pixels based on that specification, i.e., inside the curve is dark and outside is light. Now imagine the computer doing that for all the letters in a long sentence. You can see why that would take a lot of CPU power.
It's especially bad when the resolution is high... again, because this happens on the CPU not the GPU. There are thousands upon thousands of pixels to fill in based on the shape of the letter.
That's why @DusX's example is very important to pay attention to.
My 'best practice' goes like this: set the 'font size' input so that the text fills the entire frame that is output by the Text Draw actor. Then reduce the 'output width' and 'output height' inputs and scale the resulting image as needed using either the Projector actor or the Matte++ actor. Rule of thumb, if there's a lot of empty space in the frame output by the Text Draw actor, then you're not being as efficient as you could be.
Understanding how to keep the text resolution as low as possible while still giving you the quality you need is key to using the Text Draw actor effectively for the best performance.
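To put a rough number on it, here is a small comparison of how many pixels have to be filled for a full HD frame versus the 300 x 30 rectangle mentioned earlier in the thread. It assumes, as a simplification, that the CPU cost grows roughly with the number of pixels; `pixelCount` is a name made up for this sketch.

```javascript
// Number of pixels the CPU has to fill for a given output size.
function pixelCount(width, height) {
  return width * height;
}

const fullFrame = pixelCount(1920, 1080); // 2073600 pixels
const smallRect = pixelCount(300, 30);    // 9000 pixels
const ratio = fullFrame / smallRect;

console.log(fullFrame, smallRect, ratio.toFixed(1));
```

So under that assumption the small rectangle is a bit over two hundred times less work per draw.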
Best Wishes,
Mark
-
Thank you @mark and @DusX for the deep dive into it!
I'll play around with the Text Draw actor to see what's best.
But my question still stands: why isn't my system really using its full potential?
I added some video delay actors, and again Isadora's LOAD goes up to 120%, while the overall system CPU stays at 90% idle and the GPU is at around 25%.
I'm using a Mac Pro 5,1 (2012) with dual Xeon X5690, 12 cores, and a GeForce 1080 Ti under High Sierra.
Isadora seems to use multithreading, as the 117% CPU load for the process shown in the Activity Monitor seems to be spread over several cores (see screenshot in previous post).
Why is Isadora struggling while there seem to be a lot of potential resources unused?
-
Another question has come up regarding performance.
My patch contains 40 Text Draw actors. If I put half of them on stand by, it reduces the LOAD to a certain extent, but far less than if I take them out completely (deleting them).
The same goes for the video delay actors (I did not try others).
Is this usual? I thought "stand by" would be comparable to "shut off" or "bypass" and would restore nearly 100% of the resources used by the actor?
-
@dillthekraut said:
I added some video delay actors, and again Isadora's LOAD is going up to 120%, while the overall CPU system stays at 90% idle and GPU is at around 25%.
The video delay actors move the GPU-based image to the CPU. I made this choice when designing the actor because the memory on a typical video card is simply less than the amount of RAM on a system.
For example, a five second delay at 30fps of 1920x1080 images requires 1.2 Gigabytes. If you add four of those delays, you've now run out of memory on the 4 GB GPU in my relatively powerful MacBook Pro, and Isadora crashes.
Given that most systems have so much more CPU memory than GPU memory, it seemed wise to make this choice. (I expect any professional level system to have 16 GB, but really a lot of folks now have 32GB or more.) Unfortunately there is a high cost performance wise when moving images from the GPU to the CPU. That's what you're seeing when you add those video delay actors.
If I put the half of it to stand by, it reduces the LOAD to a certain extent, but far from what it does, if I take them out completely
"stand by"? I don't know what you mean. Do you mean Pause Engine???
Best Wishes,
Mark
-
@dillthekraut said:
LOAD is going up to 120%
LOAD is a measure of how much time is being used to process each frame, based on the target frame rate. It is NOT a measure of your system resource usage. In Isadora, the most important thing to know is whether the scene can be processed at the selected frame rate, and LOAD provides that. A measure of 100% means that calculating/rendering the frame is taking all the time available between frame deliveries. This will lead to dropped frames.
Isadora is both multi-threaded and single-threaded. Numerous processes including video playback are very multi-threaded. Video effects, mapping, compositing etc.. are massively multi-threaded due to the use of the GPU. The scene-graph (the calculations, routing etc..) you build within your scene are single-threaded.
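As a rough model of the LOAD figure described above (the exact formula Isadora uses internally is an assumption here), think of it as the time spent on a frame divided by the time budget the target frame rate allows. `loadPercent` is a made-up name for this sketch.

```javascript
// Assumed model: LOAD = time spent on one frame / time available per frame.
function loadPercent(frameTimeMs, targetFps) {
  const budgetMs = 1000 / targetFps; // time available per frame
  return (frameTimeMs / budgetMs) * 100;
}

// At 25 fps the budget is 40 ms per frame.
const half = loadPercent(20, 25); // half the budget used
const full = loadPercent(40, 25); // the whole budget used: 100
const over = loadPercent(48, 25); // over budget, about 120: frames get dropped

console.log(half, full, over);
```

This is why LOAD can go past 100% while the machine as a whole looks mostly idle: it measures one frame against the clock, not how busy all the cores are.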
-
Thank you for the explanation @mark, I suspected something like that.
@mark said:
If I put the half of it to stand by, it reduces the LOAD to a certain extent, but far from what it does, if I take them out completely
"stand by"? I don't know what you mean. Do you mean Pause Engine???
No, I was just trying to find a word for what I meant, and thought "bypass" would work. It seems that setting "bypass" to 'on' is not the same as "deactivate actor". What I expected was a full recovery of the resources the actor would consume while NOT 'bypassed'.
E.g. LOAD without the actor at all = 50%,
adding an actor with 'bypass' off = actor is working = LOAD 80%,
set actor bypass to "on" = LOAD back to 50%.
But this isn't the case; instead it is like this:
LOAD without the actor at all = 50%,
adding an actor with 'bypass' off = actor is working = LOAD 80%,
set actor bypass to "on" = LOAD 65% instead of the expected 50%.
It is an example only. The numbers might be different.
-
@dusx thank you for the clarification. I'm aware of this. But still, shouldn't there be a connection between system resources and the LOAD (i.e. the possible frame rate and cycles)?
My question here is: what is the bottleneck if the CPU and GPU are far from being stressed? As there isn't any video file playing and all content is either generative or comes from video capture, it shouldn't be the flash drive (1500Mbit/s).
Is it maybe the bus system through which the data moves between CPU, RAM and GPU? Or maybe just the RAM itself?
That would fit with Mark's explanation of how the video delay works.
-
@dillthekraut said:
maybe the BUS system where the Data between CPU, RAM and GPU are connected?
Without looking at your file I can only guess, but for sure one common bottleneck is moving GPU data to the CPU (moving data up to the GPU is fast).
If you would like me to take a deeper look, please feel free to open a support request, where I can then request a copy of your project file.
-
Did it. Thank you!
-
@dillthekraut said:
My question here is, what is the bottleneck if the CPU and GPU are far from being stressed? As there isn't any video file playing and all content is generative only or comes from videocapture, it shouldn't be the Flashdrive (1500Mbit/s).
The speed of the hard drive has nothing to do with this issue.
As I mentioned above about the Video Delay actor, it needs to move the image from the GPU to the CPU. Then I said "Unfortunately there is a high cost performance wise when moving images from the GPU to the CPU. That's what you're seeing when you add those video delay actors."
GPUs are designed to pull data from CPU RAM very very quickly. But they are not designed to go in the other direction. (Why is this? Because GPUs are designed for gaming, not for video processing. A game never needs to get the image back from the GPU, so GPUs are not designed to deal with this use case.)
In any case, when you ask the GPU to give the data back to you, it causes what's called a "stall" -- the GPU needs to finish all operations at the moment you ask for the image. Such a stall destroys the parallel processing (= threading) that makes the GPU so fast. Moreover, the CPU needs to sit and wait for all the pending GPU operations to complete.
It is possible that we could make a Video Delay actor that keeps all the frames on the GPU, which would make it more efficient. The problem is it's not trivial to find out how much memory is available in the GPU and to get the actor to fail gracefully if there isn't enough GPU memory.
Again, every frame of a 1920x1080 image consumes 8.29MB. You want a ten second delay at 30 fps? That's 30 x 10 x 8.29MB = 2.4GB. A lot of GPUs could handle this, but some would run out of memory. It was this fact that led to my decision to keep the delayed frames in CPU RAM.
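For anyone who wants to plug in their own resolution and delay time, the arithmetic can be sketched as below. It assumes 4 bytes per pixel, which matches the 8.29MB per frame figure; `delayBufferBytes` is a made-up name, and the results are in decimal units, so they land in the same ballpark as the figures quoted above rather than matching them digit for digit.

```javascript
// Memory needed to hold every frame of a video delay.
// Assumption: 4 bytes per pixel (e.g. 8-bit RGBA), uncompressed frames.
function delayBufferBytes(width, height, fps, seconds, bytesPerPixel = 4) {
  const frameBytes = width * height * bytesPerPixel;
  return frameBytes * fps * seconds;
}

const oneFrame = delayBufferBytes(1920, 1080, 1, 1);    // 8294400 bytes
const fiveSec = delayBufferBytes(1920, 1080, 30, 5);    // 1244160000 bytes
const tenSec = delayBufferBytes(1920, 1080, 30, 10);    // 2488320000 bytes

console.log((oneFrame / 1e6).toFixed(2) + " MB per frame");
console.log((fiveSec / 1e9).toFixed(2) + " GB for 5 s");
console.log((tenSec / 1e9).toFixed(2) + " GB for 10 s");
```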
Best Wishes,
Mark