Jasper, Marc2021-08-182021-08-182021http://hdl.handle.net/2003/4047310.17877/DE290R-22348This thesis by publications focuses on realistic benchmarks for software verification approaches. Such benchmarks are crucial to an evaluation of verification tools which helps to assess their capabilities and inform potential users. This work provides an overview of the current landscape of verification tool evaluation and compares manual and automatic approaches to benchmark generation. The main contribution of this thesis is a new framework to synthesize realistic verification tasks. This framework allows to generate verification tasks that target sequential or parallel programs. Starting from a realistic formal specification, a Büchi automaton is synthesized while ensuring realistic hardness characteristics such as the number of computation steps after which errors occur. The resulting automaton is then transformed to a Mealy machine to produce a sequential program in C or Java or to a parallel composition of modal transition systems. A refinement of the latter is encoded in Promela or as a Petri net. A task that targets such a parallel system requires checking whether or not a given interruptible temporal property is satisfied or whether parallel systems are weakly bisimilar. Temporal properties may include branching-time and linear-time formulas. For the latter, it can be ensured that every parallel component matters during verification. This thesis contains additional contributions that build on top of attached publications. These are (i) a generalization of interruptibility that covers branching-time properties, (ii) an improved generation of parallel contexts, and (iii) a definition of alphabet extension on a semantic level. Alphabet extensions are a key part for ensuring hardness of generated tasks that target parallel systems. Benchmarks that were synthesized using the presented framework have been employed in the international Rigorous Examination of Reactive Systems (RERS) Challenge during the last five years. Several international teams attempted to solve the corresponding verification tasks and used ten different tools to verify the newly added parallel programs. Apart from the evaluation of these tools, this endeavor motivated participants of RERS to conceive new formal techniques to verify parallel systems. The result of this thesis thus helps to improve the state of the art of software verification.enBenchmark generationProgram verificationModel checkingBisimulation checkingTemporal logicsLTLHMLCTLHard benchmarksError witnessesSynthesisParallel decompositionProperty preservationModal transition systemsModal contractsAlphabet extension004Synthesizing realistic verification tasksdoctoral thesisBenchmarkProgrammverifikationModel CheckingBisimulationTemporale LogikSynthese