Your question is useful, but I think it is slightly adjacent to the point, in a way that the following explanation should make clear.
First, for an event to be 'physical' at least classically, it needs to be associated with a location in space and time according to some observer. Let's just take for granted we are in flat spacetime and using Cartesian coordinates; the general case shouldn't matter for you.
Suppose Alice sees event $X$ occur at $t=5$s, $x=10$m (say) and event $Y$ at $t=10$s, $x=15$m. Then Bob, who is moving relative to Alice, will generally measure a time delay between $X$ and $Y$ that differs from $5$s and a spatial separation that differs from $5$m.
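For concreteness, here is a sketch of the transformation involved, assuming Bob moves at constant velocity $v$ along Alice's $x$-axis (the direction of motion is my assumption; it is not fixed above). Writing $\Delta t = 5$s and $\Delta x = 5$m for Alice's separations, the standard Lorentz transformation gives Bob's separations as

$$\Delta t' = \gamma\left(\Delta t - \frac{v\,\Delta x}{c^2}\right), \qquad \Delta x' = \gamma\left(\Delta x - v\,\Delta t\right), \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}.$$

Both primed quantities depend on $v$, but the combination $c^2\Delta t'^2 - \Delta x'^2 = c^2\Delta t^2 - \Delta x^2$ is the same for Alice and Bob.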
Because of the specific way the transformations work, should the delay between $X$ and $Y$ be shorter than the time it would take light to travel between them, it may happen that Bob sees $Y$ occur before $X$. On the other hand, should the delay be longer than the light travel time, all observers agree on at least the order in which the events occur.
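As a rough numerical check of the paragraph above, here is a minimal Python sketch. The function names `boost_interval` and `order_can_flip`, the boost speed of $0.8c$, and the second pair of events (1 s and 2 light-seconds apart) are all illustrative assumptions of mine, not part of the example with Alice and Bob; the boost is taken along the $x$-axis.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_interval(dt, dx, v):
    """Return (dt', dx') as measured by an observer moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dt - v * dx / C**2), gamma * (dx - v * dt)


def order_can_flip(dt, dx):
    """True if light could not cover dx within dt, so some observer sees the order reversed."""
    return abs(dx) > C * abs(dt)


# Alice's events X and Y: 5 s and 5 m apart (delay far longer than light travel time).
dt_alice, dx_alice = 5.0, 5.0

# Hypothetical second pair: 1 s apart in time, 2 light-seconds apart in space.
dt_far, dx_far = 1.0, 2.0 * C

v = 0.8 * C  # assumed speed of Bob relative to Alice

dt_bob_alice, _ = boost_interval(dt_alice, dx_alice, v)
dt_bob_far, _ = boost_interval(dt_far, dx_far, v)

print(order_can_flip(dt_alice, dx_alice), dt_bob_alice)  # delay stays positive for Bob
print(order_can_flip(dt_far, dx_far), dt_bob_far)        # delay becomes negative for Bob
```

For Alice's pair the sign of the delay is the same for every allowed $v$; for the hypothetical far-apart pair, a fast enough Bob sees the second event first.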
That temporal ordering depends on the light travel time enters at a very early level in the physics, and a theory without this property at normal scales would probably be dramatically at odds with observation.
Now if $X$ causes $Y$ it must do so for all observers (at least, this is generally assumed). Otherwise one would get into weird situations where some observers see people born before their mothers etc. This unambiguous temporal ordering is what physicists are generally referring to when they discuss causality. In particular, two simultaneous events at different locations in space cannot cause one another.
There is nothing (at least nothing so low-level), however, that requires the causal effect of $X$ upon $Y$ to "take some time" once it has reached $Y$. It is just that "news" that $X$ has occurred takes a finite time to propagate.