.. Neighbour Loop Task Mladen Ivkovic Sep 2020 .. _task_adding_your_own_neighbour_loop: .. highlight:: c Adding a Particle Interaction/Neighbour Loop Task ================================================= There are quite a few subtle and not so subtle differences when adding tasks that include particle neighbour loops and/or particle interactions compared to "independent" tasks, where no particles interact with each other, but work is done on them individually. Particle interactions are handled on a cell basis. If only particles of one cell interact with each other, the task is referred to as a ``self`` task. When particles of two different cells interact with each other, we call the task type ``pair``. For a particle neighbour loop, one typically requires both ``self`` and ``pair`` tasks. Furthermore, sometimes interaction tasks which have too much of a workload can get split into smaller units of work with tasks of the type ``sub_self`` and ``sub_pair``. For the next paragraphs, let's assume that we want to implement the task ``new_iact``. Adding it to the Task List -------------------------- First you will need to add it to the task list situated in ``task.h`` and ``task.c``. Here we aren't adding a task **type**, as the type will always be a ``pair`` or a ``self``, but **subtypes**. In ``task.h``, you need to provide an additional entry to the enum ``task_subtypes`` (e.g. ``task_subtype_new_iact``). The last entry ``task_subtype_count`` should always stay at the end as it is a counter of the number of elements. For example:: enum task_subtypes { task_subtype_none = 0, task_subtype_density, task_subtype_gradient, task_subtype_force, task_subtype_limiter, task_subtype_grav, task_subtype_external_grav, task_subtype_tend_part, task_subtype_tend_gpart, task_subtype_tend_spart, task_subtype_tend_sink, task_subtype_tend_bpart, task_subtype_xv, task_subtype_rho, task_subtype_part_swallow, task_subtype_bpart_merger, task_subtype_gpart, task_subtype_multipole, task_subtype_spart_density, task_subtype_stars_density, task_subtype_stars_feedback, task_subtype_new_iact, task_subtype_count } __attribute__((packed)); In ``task.c``, you will find an array containing the name of each task and need to add your own (e.g. ``new_iact``). Be careful with the order that should be the same than in the previous list. For example:: const char *subtaskID_names[task_subtype_count] = { "none", "density", "gradient", "force", "limiter", "grav", "external_grav", "tend_part", "tend_gpart", "tend_spart", "tend_sink", "tend_bpart", "xv", "rho", "part_swallow", "bpart_merger", "gpart", "multipole", "spart", "stars_density", "stars_feedback","sf_count", "bpart_rho", "sink", "new_iact" }; Adding it to the Cells ---------------------- Each cell contains a list to its tasks and therefore you need to provide a link for it. In ``cell_.h``, add a pointer to a task in the structure. For example, cooling couples to the hydro particles, so we'll be adding our task to ``cell_hydro.h``. We won't be adding just one task though, but an entire (linked) list of them, since we're going to need a ``self`` type task and multiple ``pair`` type tasks to have a complete neighbour loop. So instead of pointing to a single task, we store a struct ``link`` in the cell struct. For example:: /** * @brief Hydro-related cell variables. */ struct cell_hydro { /*! Pointer to the #part data. */ struct part *parts; /*! Pointer to the #xpart data. */ struct xpart *xparts; /* Lot of stuff */ /*! Task for sorting the stars again after a SF event */ struct task *stars_resort; /*! My new interaction task */ struct link *new_iact; /* Lot of stuff after */ }; Adding a new Timer ------------------ As SWIFT is HPC oriented, any new task need to be optimized. It cannot be done without timing the function. In ``timers.h``, you will find an enum that contains all the tasks. You will need to add yours inside it. For example:: enum { timer_none = 0, timer_prepare, timer_init, timer_drift_part, timer_drift_gpart, timer_kick1, timer_kick2, timer_timestep, timer_endforce, timer_dosort, timer_doself_density, timer_doself_gradient, timer_doself_force, timer_dopair_density, timer_dopair_gradient, timer_dopair_force, timer_dosub_self_density, timer_dosub_self_gradient, timer_dosub_self_force, timer_dosub_pair_density, timer_dosub_pair_gradient, timer_dosub_pair_force, timer_doself_subset, timer_dopair_subset, timer_dopair_subset_naive, timer_dosub_subset, timer_do_ghost, timer_do_extra_ghost, timer_dorecv_part, timer_do_cooling, timer_gettask, timer_qget, timer_qsteal, timer_locktree, timer_runners, timer_step, timer_cooling, timer_new_iact, timer_count, }; As for ``task.h``, you will need to give a name to your timer in ``timers.c``:: const char* timers_names[timer_count] = { "none", "prepare", "init", "drift_part", "kick1", "kick2", "timestep", "endforce", "dosort", "doself_density", "doself_gradient", "doself_force", "dopair_density", "dopair_gradient", "dopair_force", "dosub_self_density", "dosub_self_gradient", "dosub_self_force", "dosub_pair_density", "dosub_pair_gradient", "dosub_pair_force", "doself_subset", "dopair_subset", "dopair_subset_naive", "dosub_subset", "do_ghost", "do_extra_ghost", "dorecv_part", "gettask", "qget", "qsteal", "locktree", "runners", "step", "cooling", "new_iact", }; You can now easily time your functions by using:: TIMER_TIC; /* Your complicated functions */ if (timer) TIMER_TOC(timer_new_iact); Adding your Task to the System ------------------------------ Now the tricky part happens. SWIFT is able to deal automatically with the conflicts between tasks, but unfortunately cannot understand the dependencies. To implement your new task in the task system, you will need to modify a few functions in ``engine_maketasks.c``. First, you will need to add mainly three functions: ``scheduler_addtask``, ``engine_addlink``, ``scheduler_addunlocks`` in the ``engine_make_extra_hydroloop_tasks_mapper`` functions (depending on the type of task you implement, you will need to write it to a different function). The (hydro) particle interaction tasks are first created only for the density loop, and then replicated in ``engine_make_extra_hydroloop_tasks_mapper`` for everything else. In ``engine_make_extra_hydroloop_tasks_mapper``, we add the task through the following call:: struct task *t_new_iact = NULL; /* ... lots of stuff ... */ /* Self-interaction? */ else if (t_type == task_type_self && t_subtype == task_subtype_density) { /* ... lots of stuff ... */ t_new_iact = scheduler_addtask(sched, task_type_self, task_subtype_new_iact, flags, 0, ci, NULL); /* Link the tasks to the cells */ engine_addlink(e, &ci->new_iact, t_new_iact); /* Create the task dependencies */ scheduler_addunlock(sched, ci->task_that_unlocks_this_one, t_new_iact); scheduler_addunlock(sched, t_new_iact, ci->task_that_will_be_unlocked_by_this_one); } /* Otherwise, pair interaction? */ else if (t_type == task_type_pair && t_subtype == task_subtype_density) { /* ... lots of stuff ... */ t_new_iact = scheduler_addtask(sched, task_type_pair, task_subtype_new_iact, flags, 0, ci, cj); engine_addlink(e, &ci->new_iact, t_new_iact); engine_addlink(e, &cj->new_iact, t_new_iact); /* ... lots of stuff ... */ if (ci->nodeID == nodeID) { /* ... lots of stuff ... */ scheduler_addunlock(sched, ci->task_that_unlocks_this_one, t_new_iact); scheduler_addunlock(sched, t_new_iact, ci->task_that_will_be_unlocked_by_this_one); } if (cj->nodeID == nodeID) { if (ci->hydro.super != cj->hydro.super) { /* ... lots of stuff ... */ scheduler_addunlock(sched, cj->task_that_unlocks_this_one, t_new_iact); scheduler_addunlock(sched, t_new_iact, cj->task_that_will_be_unlocked_by_this_one); } } } /* Otherwise, sub-self interaction? */ else if (t_type == task_type_sub_self && t_subtype == task_subtype_density) { /* You need to do the same as for task_type_self above */ } /* Otherwise, sub-pair interaction? */ else if (t_type == task_type_sub_pair && t_subtype == task_subtype_density) { /* You need to do the same as for task_type_pair above */ } The next step is to activate your task in ``engine_marktasks_mapper`` in ``engine_marktasks.c``:: /* Single-cell task? */ if (t_type == task_type_self || t_type == task_type_sub_self) { /* ... lots of stuff ... */ else if (t_subtype == task_subtype_new_iact) { scheduler_activate(s, t); } } /* Pair? */ else if (t_type == task_type_pair || t_type == task_type_sub_pair) { /* ... lots of stuff ... */ else if (t_subtype == task_subtype_new_iact) { scheduler_activate(s, t); } } Then you will need to update the estimate for the number of tasks in ``engine_estimate_nr_tasks`` in ``engine.c`` by modifying ``n1`` or ``n2``. ``n1`` is the expected maximal number of tasks per top-level/super cell. ``n2`` ``n2`` is the expected maximum number of tasks for all other cells, independent of the depth of the tree. Most likely ``n2`` won't need updating, and you will only need to update ``n1``. As to how to update ``n1``, you just need to count the number of tasks that you will be adding, e.g. 1 self + (3^3-1)/2 = 13 pair tasks + 1 ghost, etc... All these numbers can be overwritten at run time by the user anyway in the parameter file (``Scheduler: tasks_per_cell``). Then give the task an estimate of the computational cost that it will have in ``scheduler_reweight`` in ``scheduler.c``:: case task_type_self: if (t->subtype == task_subtype_grav) { cost = 1.f * (wscale * gcount_i) * gcount_i; /* ... lots of stuff ... */ else if (t->subtype == task_subtype_new_iact) cost = 1.f * wscale * scount_i * count_i; else error("Untreated sub-type for selfs: %s", subtaskID_names[t->subtype]); break; Similarly, you'll need to update ``case task_type_sub_self``, ``task_type_pair``, and ``task_type_sub_pair`` as well. This activates your tasks once they've been created. Initially, the engine will need to skip the task that updates the particles. If this is the case for your task, you will need to add it in ``engine_skip_force_and_kick``. Additionally, the tasks will be marked as 'to be skipped' once they've been executed during a time step, and then reactivated during the next time step if they need to be executed again. This way, all the created tasks can be kept and don't need to be recreated every time step. In order to be unskipped however, you need to add the unskipping manually to ``engine_do_unskip_mapper()`` in ``engine_unskip.c``. Finally, you also need to initialize your new variables and pointers in ``space_rebuild_recycle_mapper`` in ``space_recycle.c``. Additionally, you need to initialize the ``link`` structs in ``cell_clean_links`` in ``cell.c``. Implementing your Task ---------------------- The last part is situated in ``runner_main.c``, where the actual functions executed by the task are called inside the function in ``runner_main`` in the switch:: /* Different types of tasks... */ switch (t->type) { case task_type_self: if (t->subtype == task_subtype_density) runner_doself1_branch_density(r, ci); /* ... lots of stuff ... */ else if (t->subtype == task_subtype_new_iact) runner_doself_branch_new_iact(r, ci, 1); else error("Unknown/invalid task subtype (%s).", subtaskID_names[t->subtype]); break; case task_type_pair: /* ... lots of stuff ... */ else if (t->subtype == task_subtype_new_iact) runner_dopair_branch_new_iact(r, ci, cj, 1); else error("Unknown/invalid task subtype (%s/%s).", taskID_names[t->type], subtaskID_names[t->subtype]); break; case task_type_sub_self: /* ... lots of stuff ... */ else if (t->subtype == task_subtype_new_iact) runner_dosub_self_new_iact(r, ci, 1); else error("Unknown/invalid task subtype (%s/%s).", taskID_names[t->type], subtaskID_names[t->subtype]); break; case task_type_sub_pair: /* ... lots of stuff ... */ else if (t->subtype == task_subtype_new_iact) runner_dosub_pair_new_iact(r, ci, cj, 1); else error("Unknown/invalid task subtype (%s/%s).", taskID_names[t->type], subtaskID_names[t->subtype]); break; The functions ``runner_doself1_branch_density``, ``runner_dopair_branch_new_iact``, ``runner_dosub_self_new_iact``, and ``runner_dosub_pair_new_iact`` still need to be implemented by you. If you only plan on doing this type of particle interaction once per time step, you can get away with directly implementing these functions and call it a day. But if you intend to use the same kind of particle loop more than once, as it's done in e.g. the hydro density and force loops, it's better to construct the functions using macros. For example, you could have a file ``runner_doiact_my_stuff.h``:: /* File runner_doiact_my_stuff.h */ #define PASTE(x, y) x##_##y #define _DOSELF1_BRANCH_NEW(f) PASTE(runner_doself_branch, f) #define DOSELF1_BRANCH_NEW _DOSELF1_BRANCH_NEW(FUNCTION) #define _DOPAIR1_BRANCH_NEW(f) PASTE(runner_dopair_branch, f) #define DOPAIR1_BRANCH_NEW _DOPAIR1_BRANCH_NEW(FUNCTION) #define _DOSUB_PAIR1_NEW(f) PASTE(runner_dosub_pair, f) #define DOSUB_PAIR1_NEW _DOSUB_PAIR1_NEW(FUNCTION) #define _DOSUB_SELF1_NEW(f) PASTE(runner_dosub_self, f) #define DOSUB_SELF1_NEW _DOSUB_SELF1_NEW(FUNCTION) #define _IACT_NEW(f) PASTE(runner_iact, f) #define IACT_NEW _IACT_NEW(FUNCTION) void DOSELF1_BRANCH_NEW(struct runner *r, struct cell *c, int timer); void DOPAIR1_BRANCH_NEW(struct runner *r, struct cell *ci, struct cell *cj, int timer); void DOSUB_SELF1_NEW(struct runner *r, struct cell *ci, int timer); void DOSUB_PAIR1_NEW(struct runner *r, struct cell *ci, struct cell *cj, int timer); And a second file, ``runner_doiact_function_my_stuff.h``, where you define those functions which have been declared using the macros, e.g. :: #include "runner_doiact_my_stuff.h" void DOSELF1_BRANCH_NEW(struct runner *r, struct cell *c, int timer) { /* do your stuff, call IACT_NEW(...) at some point...*/ } void DOPAIR1_BRANCH_NEW(struct runner *r, struct cell *ci, struct cell *cj, int timer) { /* do your stuff, call IACT_NEW(...) at some point...*/ } void DOSUB_SELF1_NEW(struct runner *r, struct cell *c, int timer) { /* do your stuff, call IACT_NEW(...) at some point...*/ } void DOSUB_PAIR1_NEW(struct runner *r, struct cell *ci, struct cell *cj, int timer) { /* do your stuff, call IACT_NEW(...) at some point...*/ } Then we also need a ``runner_doiact_my_suff.c`` file where the functions declared in ``runner_doiact_my_suff.h`` are defined by including them with ``FUNCTION`` defined:: #include /* other includes too... */ /* Import the new interaction loop functions. */ #define FUNCTION new_iact #include "runner_doiact_functions_my_stuff.h" #undef FUNCTION Finally, we include them in ``runner_main.c`` as follows:: /* ... lots of includes and stuff ... */ /* Import new interaction loop functions. */ #define FUNCTION new_iact #include "runner_doiact_my_suff.h" #undef FUNCTION /** * @brief The #runner main thread routine. * * @param data A pointer to this thread's data. */ void *runner_main(void *data) { /* ... */ } The functions ``runner_doself_branch_density``, ``runner_dopair_branch_new_iact``, ``runner_dosub_self_new_iact``, and ``runner_dosub_pair_new_iact`` will be properly found and linked this way. All that's left for you to do is to write the function into which ``IACT_NEW`` will expand, in the above case it would be ``runner_iact_new_iact``. If you intend on using existing neighbour loops with a new different name, or plan on including your newly written header files multiple times with slight changes that can be dealt with a preprocessing macro, you can define a new macro like this:: /* ... lots of includes and stuff ... */ /* Import new interaction loop functions. */ #define FUNCTION new_iact #define FUNCTION_TASK_LOOP TASK_LOOP_NEW_IACT #include "runner_doiact_my_suff.h" #undef FUNCTION #undef FUNCTION_TASK_LOOP /** * @brief The #runner main thread routine. * * @param data A pointer to this thread's data. */ void *runner_main(void *data) { /* ... */ } However, in that case you will also need to give your ``TASK_LOOP_NEW_IACT`` a unique definition in ``src/runner.h``, e.g.:: /* Unique identifier of loop types */ #define TASK_LOOP_DENSITY 0 #define TASK_LOOP_GRADIENT 1 #define TASK_LOOP_FORCE 2 #define TASK_LOOP_LIMITER 3 #define TASK_LOOP_FEEDBACK 4 #define TASK_LOOP_SWALLOW 5 #define TASK_LOOP_SINK_FORMATION 6 #define TASK_LOOP_NEW_IACT 7 Finalizing your Task -------------------- Now that you have done the easiest part, you can start debugging by implementing a test and/or an example. Before creating your merge request with your new task, do not forget the most funny part that consists in writing a nice and beautiful documentation ;) Things to Keep in Mind ---------------------- - If you are inserting a new neighbour loop in between existing loops, or want to insert more than one neighbour loop, usually a new ghost task in between them is also needed. - Neighbour loops may also require MPI communication tasks.