Using eye-tracking as a window on cognitive processing, this study investigates language effects on attention to motion events in a non-verbal task. We compare gaze allocation patterns by native speakers of German and Modern Standard Arabic (MSA), two languages that differ with regard to the grammaticalization of temporal concepts. Findings of the non-verbal task, in which speakers watch dynamic event scenes while performing an auditory distracter task, are compared to gaze allocation patterns which were obtained in an event description task, using the same stimuli. We investigate whether differences in the grammatical aspectual systems of German and MSA affect the extent to which endpoints of motion events are linguistically encoded and visually processed in the two tasks. In the linguistic task, we find clear language differences in endpoint encoding and in the eye-tracking data (attention to event endpoints) as well: German speakers attend to and linguistically encode endpoints more frequently than speakers of MSA. The fixation data in the non-verbal task show similar language effects, providing relevant insights with regard to the language-and-thought debate. The present study is one of the few studies that focus explicitly on language effects related to grammatical concepts, as opposed to lexical concepts.