Code crashing when using multiprocessing.Pool



  • Hey all!!

    My first post here! Exciting.
    I've started learning drawBot recently, and things are developing pretty well by now. But so are my scripts and their complexity. It now takes me 2 hours to render 8 seconds at 24 fps... Hmm, at this rate I can only do 4 renders in one working day. That's not gonna fly 🙂 (yes, I could clean up and make the code more efficient, but I want it flexible and modular for reuse)

    I was looking at multiprocessing (parallel processing) for the loop, since it's a simple repetitive loop where no frame shares data with any other.

    BUT! It seems to crash when using drawBot in the pool. I've stripped the code down as much as possible to get to the essence.

    -- Edit --
    Forgot to mention that the process just crashes without an error message. It just gives me a crash popup and the console seems to freeze. I can still cancel with Ctrl-C in the console to exit, though.
    -- End edit --

    As for the question.
    How could I fix this or is there another/better way for multiprocessing?

    import multiprocessing
    import drawBot

    canvasWidth = 500
    canvasHeight = 500
    fps = 12
    seconds = 10
    duration = 1 / fps
    frames = seconds * fps
    savePath = '_export/export_'  # module level, so the worker processes can see it


    def run(frame):
        print("frame:", frame)
        drawBot.newDrawing()  # start from scratch, no multipage
        drawBot.newPage(canvasWidth, canvasHeight)
        # do cool stuff
        drawBot.saveImage(savePath + str(frame) + ".png")


    if __name__ == '__main__':
        p = multiprocessing.Pool(processes=multiprocessing.cpu_count() - 1)

        for frame in range(frames):
            p.apply_async(run, [frame])

        p.close()
        p.join()

        print("Complete")

  • admin

    Jumping to multiprocessing seems like a very pro-level step.

    I would encourage you to first find out what takes so much time.

    I would not generate high-def 5k videos... The size of the canvas contributes a lot to the generation time.

    User @MauriceMeilleur has already done some heavy research on generating big data videos. The big speedup was to generate each frame into a folder and collect all the frames into a movie afterwards. This prevents your memory from being flooded with the data for every frame.

    A big win can also be achieved by installing drawBot as a module and running drawBot app-less:

    import drawBot
    from drawBot.context.tools.mp4Tools import generateMP4

    frames = 5
    destinationPath = "/path/to/save/final/movie.mp4"
    frameRate = 1/30
    images = []
    for i in range(frames):
        drawBot.newDrawing()
        drawBot.newPage(200, 200)
        # do stuff
        path = "/path/to/save_%s.png" % i
        drawBot.saveImage(path)
        images.append(path)


    generateMP4(images, destinationPath, frameRate)
    

    also see http://forum.drawbot.com/topic/64/generate-animated-gifs-pdfs-similar-to-generatemp4/5



  • Hey @frederik, thanks for your reply!

    Yeah, I've tried that approach already. The problem is that it somehow takes even longer. I've tried both options three times each (option 1: export as one gif; option 2: export separate PNGs and stitch them together). After 3 runs of each, exporting separate PNGs took 150% of the time of the one-gif option. (Might that have been because of other processes? But I did run 3 tests of each.)

    And yes, I know the bottleneck of my script. It's not the size but the amount of work in the loops. Per frame I generate 80 rows and 100 columns of unique paths, which comes down to 8,000 paths per frame. 🙂 And yes, I know of a way to simplify this. But I really don't want to, since it would create different problems.

    Debugging I get this error

    objc[65010]: +[NSBezierPath initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
    

    I'm focusing on this part:

    We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
    

    Googling this, I found that it's a safety measure in macOS: since 10.13 the Objective-C runtime deliberately crashes a fork()ed child process that touches system frameworks, because they can't be used safely after a fork() without an exec().


  • admin

    Whoops, could you show example code that ends up in this error?

    And threading slows things down in most cases.



  • Success!! But more on that later

    @frederik, the error is from the code above. It was run from PyCharm, which I think gives more detailed feedback.
    Apparently macOS 10.13 introduced a fork-safety check that breaks some Python functionality, throwing this error.
    And yes, my goal is to multiprocess, not to multithread.

    I've tried a different approach, as described in this document:
    https://docs.python.org/3.4/library/multiprocessing.html
    Instead of the default fork start method I now use spawn, and this seems to work!!! I've tested my script 3 times without multiprocessing and got 3 x 73 minutes.
    With multiprocessing I got two results of 34 minutes and one result of 13 minutes. Now that's a difference...
    Too bad I only got the 13 minutes once.

    I'll post my stripped code later so you can see the concept, and maybe someone can optimize it. 🙂 🙂
    One remaining problem, for example: I've got 60 frames and it launches 60 processes at once.
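
    In case it helps anyone hitting the same crash, here's a minimal sketch of just the start-method switch (using the run() and frames from my first post, not my full script):

    import multiprocessing

    # run() and frames as defined in my first post above

    if __name__ == '__main__':
        # 'spawn' starts each worker as a fresh Python interpreter instead of
        # fork()ing the parent, which sidesteps the Objective-C fork-safety crash
        ctx = multiprocessing.get_context('spawn')
        processes = [ctx.Process(target=run, args=(frame,)) for frame in range(frames)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()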



  • All credit to @justvanrossum for telling me about the save-separate-frames-combine-after approach. But even short of doing that, I got a huge boost in render speed by making sure the saveImage() call is outside the for-loops generating frames. Animations/videos that used to take hours started rendering in minutes once I'd gotten that tip from Marcin Dybaś.



  • Hey @MauriceMeilleur, thanks for your reply!
    Yeah, I saw your thread. The largest part of my implementation is based on that solution 🙂 I think you and @justvanrossum would recognize the code. Just would probably recognize even more: in this demo/test I've used a basic animation he showed us at a workshop in Amsterdam.

    I've been able to reproduce the 13-minute times again. Turns out I got impatient running the last two test counts and opened some applications to return to other projects. After closing all apps and just letting the script run, I got the same 13-minute counts again. 1/5 of the serial processing time.

    This is my solution so far. It is based on the thread "Drawbot on something more powerful than desktops and laptops?"
    I have 3 files for this:

    • settings.py # holds the global settings; it would be a waste to (re)set them in both (all) the files
    • main.py # yeah, well, the main script
    • export_gif.py # after exporting I run this script to combine all the images into one animated gif file

    I'll show you the stripped-down base of this script. But first let me show you the core of the solution and work in progress: the part that spawns multiple processes (with separated memory, if I'm not mistaken).
    I still have two main obstacles to cross; I'll talk about them after the script. First the part I'm currently focusing on, namely the multiprocessing part.

    def main_mp():
        numProcesses = multiprocessing.cpu_count()
        print('utilizing %d cores\n' % numProcesses)

        ctx = multiprocessing.get_context('spawn')
        processes = []
        queue = ctx.Queue()  # what does this do? :) :)

        for frame in range(set.frames):
            processes.append(ctx.Process(target=run, args=([frame],)))

        for p in processes:
            p.start()

        for p in processes:
            p.join()

        # for _ in range(times):  # I've commented this one out. But what does it do? :) :)
        #     assert queue.get(), "One of processes failed."
    

    That was the main focus. Now let's see all the scripts combined.
    settings.py shares the same settings between, in this case, both scripts.

    # settings.py
    
    canvasWidth = 500
    canvasHeight = 500
    fps = 12
    seconds = 10
    duration = 1 / fps
    frames = seconds * fps
    
    pathExport = '___export/'
    exportFileName = "export_"
    exportFileType = "gif"
    

    Then main.py. run() has two demo parts: one that creates an image, so you can see that an animated gif can be made from the output (code from a workshop with Just), and one part that puts some load on the processors (without it, this demo would actually run more efficiently in serial than in parallel 🙂).
    Also, I've included both a main() and a main_mp(): one runs a serial loop and the other the parallel, multiprocessing version (so I can benchmark both approaches).

    # main.py
    
    import multiprocessing
    import datetime
    import drawBot
    #
    import settings as set
    
    
    
    def run(args):
        frame = args[0]
        print("frame:", frame)

        drawBot.newDrawing()  # start from scratch, no multipage
        drawBot.newPage(set.canvasWidth, set.canvasHeight)

        time = frame / set.frames

        # draw a background
        drawBot.fill(1)  # white
        drawBot.rect(0, 0, set.canvasWidth, set.canvasHeight)
        drawBot.fill(0)  # black

        angle = drawBot.radians(360 * time)
        x = set.canvasWidth / 2 + 100 * drawBot.sin(angle)
        drawBot.oval(x - 100, 200, 200, 200)

        drawBot.saveImage(set.pathExport + set.exportFileName + str(frame) + "." + set.exportFileType)

        for i in range(20):
            frame = frame * frame  # artificial CPU load, so the parallel version has something to chew on
    
    
    def main_mp():
        numProcesses = multiprocessing.cpu_count()
        print('utilizing %d cores\n' % numProcesses)

        ctx = multiprocessing.get_context('spawn')
        processes = []
        queue = ctx.Queue()  # what does this do? :) :)

        for frame in range(set.frames):
            processes.append(ctx.Process(target=run, args=([frame],)))

        for p in processes:
            p.start()

        for p in processes:
            p.join()

        # for _ in range(times):  # I've commented this one out. But what does it do? :) :)
        #     assert queue.get(), "One of processes failed."
    
    
    def main():
        for frame in range(set.frames):
            run( [ frame ] )
    
    
    if __name__ == '__main__':
        startTime = datetime.datetime.now()
    
        #main_mp()
        main()
    
        print(datetime.datetime.now() - startTime)
        print("--End--")
    

    And last but not least, export_gif.py.
    The exported frames have to be gifs for generateGif() to combine them into an animated gif.

    # export_gif.py
    import settings as set
    
    def exportGif():
        from drawBot.context.tools.gifTools import generateGif
    
        destinationPath = set.pathExport + "movie.gif"  # pathExport already ends with a slash
        images = []
        durations = []
    
        for frame in range(set.frames):
            path = set.pathExport + set.exportFileName + str(frame) + "." + set.exportFileType
            images.append(path)
            durations.append(set.duration)
    
        generateGif(images, destinationPath, durations)
    
    if __name__ == '__main__':
        exportGif()
    

    So, my main issue at this time is:

    • main_mp() is not finished yet. It's not using the/a queue, which results in the script starting (in this case) 120 individual processes at once. Running all of them together is, I'm sure, not efficient; the OS switches between them all the time.

    multiprocessing.cpu_count()

    This tells me I have 8 CPU cores available. I think it would be better to have only cpu_count() - 1 processes running at a time.

    As for my question at this moment:
    Does anyone know how to make main_mp() more efficient and/or how to properly implement a processing queue?
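
    (I'm not sure I've got this right, but my tentative understanding of the Queue from the docs: ctx.Queue() is a channel for passing Python objects between processes, and the commented-out assert loop would drain it to check that every worker reported success. A minimal sketch of that pattern, with a hypothetical report-back added at the end of the worker:)

    import multiprocessing

    def work(frame, queue):
        # ... render the frame here ...
        queue.put(True)  # report success back to the parent process

    if __name__ == '__main__':
        ctx = multiprocessing.get_context('spawn')
        queue = ctx.Queue()
        processes = [ctx.Process(target=work, args=(frame, queue)) for frame in range(4)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        for _ in range(4):
            assert queue.get(), "One of the processes failed."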

    A new question out of curiosity:
    Also, @MauriceMeilleur, could you elaborate on placing saveImage() outside the for-loop, as you mentioned? How would you accomplish that?



  • I might have found a solution. I have to go somewhere soon, so I don't have much time to double-check my findings. But I've added a function that builds a working pool:

    def main_pool():
        multiprocessing.set_start_method('spawn')  # again: avoid fork() on macOS
        p = multiprocessing.Pool(processes=multiprocessing.cpu_count() - 1)
        p.map(run, [[frame] for frame in range(set.frames)])  # run() expects its argument wrapped in a list
        p.close()
        p.join()

    main() # serial processing in 15.65 seconds
    main_mp() # multiprocessing in 10.50 seconds
    main_pool() # multiprocessing with a pool in 4.55 seconds

    I have only tried this with the script above, not yet with my bigger project.
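
    (A small variant of main_pool(), just a sketch under the same assumptions as above: get_context() avoids the restriction that set_start_method() may only be called once per program, and the with block cleans the pool up automatically. That's safe here because map() blocks until all frames are done.)

    import multiprocessing
    import settings as set
    # run() as defined in main.py above

    def main_pool():
        ctx = multiprocessing.get_context('spawn')  # can be used repeatedly, unlike set_start_method()
        with ctx.Pool(processes=multiprocessing.cpu_count() - 1) as p:
            p.map(run, [[frame] for frame in range(set.frames)])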

    Now, I think what remains is some insight on saveImage() outside a loop.


  • admin

    I'm not very familiar with multiprocessing and how to get the full potential out of it. But please report back!



  • @Tim: here's an excerpt from something I'm editing right now. (In case you're curious, this is an exploration of a modular script designed by André Gürtler in 1966; it draws all the single shapes that make up the script in all their possible positions, then all the two-shape combinations in all their possible positions that pass a couple of tests.)

    for m in range(len(matrix)):
        newPage(canvasX, canvasY)
        frameDuration(1/3)
        fill(1)
        rect(0, 0, canvasX, canvasY)
        translate(canvasX/2, canvasY/2)
        fill(1, 0, 0, .05)
        for l in range(len(matrix)):
            mat = matrix[l]()  # each matrix entry returns a BezierPath
            drawPath(mat)
        fill(0)
        single = matrix[m]()
        drawPath(single)

    for n in range(len(pairs)):
        compareShape(pairs[n][0](), pairs[n][1]())

    saveImage('~/Desktop/gürtler_script_matrix.pdf')
    saveImage('~/Desktop/gürtler_script_matrix.gif')

    There are 45 single shape-positions and 650 valid two-shape-position combinations, for 695 frames/pages. Note that the saveImage() calls are outside the generating for-loops.

    The code would also work if I put those calls inside the loops, but then DrawBot would make 695 versions of the final .gif and .pdf files: one with one frame/page, one with two frames/pages, one with three frames/pages, … overwriting the files each time the code generates a new frame/page. That's super processor-intensive and a memory hog to boot.

    By putting the saveImage() calls outside all the loops at the end of the code, DrawBot makes only one version of the final .gif and .pdf files, after all the frames/pages have been generated. Does that make sense?
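
    To make the contrast concrete, here's a minimal sketch (a hypothetical 10-frame animation, not the script above):

    # slow: re-exports the whole document once per frame
    for frame in range(10):
        newPage(200, 200)
        # ... draw the frame ...
        saveImage('~/Desktop/animation.gif')  # saves pages 0..frame every pass

    # fast: draw all the pages first, export once at the end
    for frame in range(10):
        newPage(200, 200)
        # ... draw the frame ...
    saveImage('~/Desktop/animation.gif')  # one export containing all 10 frames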

    (Marcin @dyb, feel free to jump in and correct anything!)

    PS Sorry for the confusing and very hacky code—it's why I'm editing right now!



  • @frederik Sorry for the late reply, I've been on vacation for a while.
    It's all a little rusty after so many days 🙂 The main focus lies on "main_pool()".

    "multiprocessing.Pool()" creates, in my 8core case, 8 separate processes (ignore the -1 in the code)
    Then "p.map()" runs function "run()" 120 times (12 fps times 10 sec. from the settings)
    but only 8 at one time. As soon as one process is finished it runs the next one.
    it saves the images separately and can be stitched together into one gif with "exportGif()"

    More in-depth info can be found at: https://docs.python.org/3.4/library/multiprocessing.html
    My example is only useful if the processes don't have to talk to each other, but the reference page above has examples for those cases as well.
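
    (Strictly speaking, even plain Pool.map() gives you one kind of communication for free: it returns each worker's return value, in order. A minimal sketch with a hypothetical per-frame computation:)

    import multiprocessing

    def renderCost(frame):
        return frame * frame  # hypothetical stand-in for real per-frame work

    if __name__ == '__main__':
        ctx = multiprocessing.get_context('spawn')
        with ctx.Pool() as p:
            results = p.map(renderCost, range(8))
        print(results)  # [0, 1, 4, 9, 16, 25, 36, 49], in frame order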

    I'll post the result of my script in a few hours. Well, the result that uses this method.