Real Learning in Virtual Worlds - CHAPTER 5: Discussion & Conclusion


CHAPTER 5: Discussion & Conclusion

5.1 Introduction

This chapter provides the analysis of the results presented in the previous chapter along with a discussion of these results and opportunities for further research.


In analysing the results, the researcher applied both quantitative and qualitative methods in order to answer the research question: How effective is it to learn in a virtual world using a traditional 2D slide show method compared to that of a 3D interactive simulation?


Quantitative methods were applied to participants’ achievement scores for the pre- and post-quiz and to the Likert scale results. Qualitative methods were applied to participants’ responses to the open questions of the post-survey.


Discussion of the results applied triangulation, combining the quantitative and qualitative results in order to better understand the 2D and 3D groups’ learning experience and any differences observed between these groups.


This chapter concludes with a discussion on the opportunities for further research.


5.2 Quantitative Analysis

5.2.1 The Results of the Hypothesis

The aim of this study was to determine if two lectures differing only in the presence or absence of 3D models (and therefore employing either 2D or 3D learning delivery) in an online 3D virtual world would produce different learning outcomes for Bloom’s cognitive processes of ‘remember’ or ‘understand’. The following hypotheses were formed:


  • (H1): That the learning outcomes for Bloom’s factual knowledge of ‘remember’ cognitive process will result in a significant difference in post-quiz scores between 2D and 3D participants.
  • (H2): That the learning outcomes for Bloom’s factual knowledge of ‘understand’ cognitive process will result in a significant difference in post-quiz scores between 2D and 3D participants.


Measured statistically, neither of the above hypotheses was sustained by the scored (quiz) testing results, as there was no statistically significant difference between the results of the two groups. The researcher applied statistical significance testing as the basis for rejecting the null hypothesis form of each of the above hypotheses (i.e. that, in each case, the process will result in NO significant difference): rejection would require a statistically measurable difference. Where no measurable difference is found between the samples, the primary hypothesis remains unconfirmed. An unconfirmed hypothesis is not thereby false; rather, it remains capable of disproof and is therefore merely unconfirmed (Karl Popper’s principle of falsifiability).


As the researcher was not able to refute the null hypothesis on the basis of a raw statistical comparison of the test scores, the researcher turned to the actual data to see whether there was a real (although possibly not significant) difference between the results of the two groups, or any clearly emerging or suggested trends that might qualify the implications of the raw statistical comparison.


5.2.2 The Results of the Pre-Quiz

5.2.2.1 Pre-Quiz Total Scores

Analysis of the results in the previous chapter for the total pre-quiz scores (i.e. both cognitive processes combined) between the 2D and 3D groups shows:

  • The pass rates for the 2D and 3D groups were 51% and 55% respectively; 4% more 3D participants than 2D participants achieved the pass mark of 4 out of 8.
  • Average scores (mean) for the 2D and 3D groups were 3.69 and 3.68 respectively. Both groups’ average scores were effectively the same.
  • Median scores for the 2D and 3D groups were both the same with a value of 4.
  • The mode for the 2D group was lower than for the 3D group: 3 and 4 respectively, demonstrating that more 2D participants scored 3 whereas more 3D participants scored 4. A score of 3 was obtained by 31% of the 2D group and 23% of the 3D group; a score of 4 by 20% and 23% respectively.
  • The range of scores for the 2D group was less than the 3D group, 1-6 and 0-7 respectively.
  • Standard deviation for the 2D group was less than for the 3D group: 1.372 and 1.479 respectively; the 2D group’s total scores were therefore clustered more closely around the mean (average score) than the 3D group’s.
  • Skewness was positive for the 2D group and negative for the 3D group, 0.007 and -0.188 respectively, indicating that the 3D group’s scores sat slightly higher than the 2D group’s. This skewness difference is due to the mode difference between the groups, as both the median and mean scores were equal.
  • Kurtosis was negative (platykurtic) for both groups. Platykurtic distributions are flatter at the top of the distribution curve and less peaked around the mean (average score). The slight difference in kurtosis between the two groups accounts for the lower probability density value in the Gaussian distribution graph in Figure 62 (Results: Pre-Quiz Totals - Histogram & Bell Curve).
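The descriptive measures listed above can be reproduced with standard library code alone. The sketch below uses hypothetical score lists (the raw participant data are not reproduced in this chapter), and the helper functions are illustrative implementations of the skewness and excess kurtosis statistics referred to in this analysis.

```python
from statistics import mean, median, mode, pstdev

# Hypothetical pre-quiz score lists standing in for the real data
scores_2d = [1, 3, 3, 3, 4, 4, 5, 6]
scores_3d = [0, 3, 4, 4, 4, 5, 6, 7]

def skewness(xs):
    """Fisher skewness: positive means a tail to the right of the mean."""
    m, s, n = mean(xs), pstdev(xs), len(xs)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

def excess_kurtosis(xs):
    """Excess kurtosis: negative means platykurtic (flat-topped)."""
    m, s, n = mean(xs), pstdev(xs), len(xs)
    return sum((x - m) ** 4 for x in xs) / (n * s ** 4) - 3

for label, xs in (("2D", scores_2d), ("3D", scores_3d)):
    print(label, round(mean(xs), 2), median(xs), mode(xs),
          round(pstdev(xs), 3), round(skewness(xs), 3),
          round(excess_kurtosis(xs), 3))
```

A positive skewness indicates the bulk of scores sit below the mean with a tail above it, while negative excess kurtosis corresponds to the platykurtic shape discussed above.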


Summary & Interpretation: Pre-Quiz Total Scores


There was a 4% higher pass rate for the 3D group, and the mode of the 3D group’s total pre-quiz scores was higher than the 2D group’s; the higher pass rate follows from the 3D group’s greater mode. The 3D group also produced a wider range of scores than the 2D group, giving the 2D group a tighter (smaller) distribution of scores around the mean.


Given the distribution of scores between the two groups, the 2D group had a higher probability of scoring around the mean than the 3D group (28% and 26% respectively). Thus, although the 3D group obtained a higher pass rate and mode, a participant in the 2D group was 2% more likely to score a 4 than a participant in the 3D group. This small percentage difference can be seen in the inverse normal distribution graph in Figure 61: in the lower and upper quartiles the 2D curve diverges from the 3D curve. In the lower quartile, participants in the 2D group scored higher; in the upper quartile, they scored lower. This slight shift of the 2D curve toward the mean demonstrates that the 2D group was more likely to obtain the mean value than the 3D group.


Although there was a difference between the 2D and 3D groups’ pre-quiz scores, the percentage difference was, in the opinion of this researcher, effectively immaterial, showing that both groups started with the same level of knowledge on the topic ‘The Physics of Bridges’ prior to the lecture.


The result for question 21 in the Likert scale survey is consistent with the above analysis. When asked to rate their level of knowledge on the topic ‘prior’ to the lecture, the combined low and medium responses for the 2D and 3D participants were 98% and 96% respectively, while 2% and 4% respectively rated their knowledge as high. This 2% difference in both responses is consistent with the data analysis above: the difference in the groups’ subjective self-assessments matches that shown by the tested assessment.


5.2.2.2 Pre-Quiz Remember and Understand Scores

In the previous chapter we found that when a significance test was performed independently on Bloom’s cognitive processes of ‘remember’ and ‘understand’ for the pre-quiz a significant difference was found between the two groups. The 2D group scored significantly higher than the 3D group for the Bloom’s cognitive process of ‘remember’ (t = 1.665, df = 109, one-tailed p = 0.0494, α = 0.05), and the 3D group scored significantly higher than the 2D group for the Bloom’s cognitive process of ‘understand’ (t = -3.03167, df = 109, one-tailed p = 0.00138, α = 0.05).
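The t statistics quoted above can be computed from the raw score lists. The sketch below is a minimal implementation assuming an equal-variance (pooled) Student's t, consistent with the df = n1 + n2 − 2 reported; the example score lists are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def pooled_t(a, b):
    """Independent-samples Student's t with pooled variance
    (df = len(a) + len(b) - 2); positive when mean(a) > mean(b)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

# Hypothetical 'remember' sub-scores for two small groups
print(round(pooled_t([3, 2, 3, 2], [2, 2, 1, 2]), 3))
```

The one-tailed p-value is then obtained from the t distribution with the stated degrees of freedom, which is where a statistics package such as SciPy would normally take over.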


The pass rates for Bloom’s ‘remember’ cognitive process for the 2D and 3D groups were 80% and 66% respectively. The pass rates for Bloom’s ‘understand’ cognitive process for the 2D and 3D groups were 35% and 52% respectively. The average scores for the 2D and 3D groups for Bloom’s ‘remember’ were 2.44 and 2.071 and for ‘understand’ 1.25 and 1.60 respectively. The standard deviations for the 2D and 3D groups for Bloom’s ‘remember’ were 1.032 and 0.775 and for Bloom’s ‘understand’ were 1.263 and 0.867 respectively.


The scores for the Bloom’s splits at the pre-quiz stage are of passing interest in this experiment (independent of the post-quiz results) and the significant differences found for these figures were not especially surprising.


This experiment was not designed to measure and compare pre versus post learning outcomes of the participants. Rather, it was designed to find differences between the 2D and 3D groups comparative learning outcomes (i.e. the post-quiz results). In other words, the research was not trying to measure ‘by how much’ learning or understanding improves, but rather the relative difference in the final results between the 2D and 3D groups.


The pre-quiz was given to obtain an indicator of the general knowledge of the material that was to be delivered so that relative differences in outcomes could be normalised against the initial positions.


With the total number of pre-quiz questions being 8, of which each of Bloom’s cognitive processes was represented by only 4 questions, there were not enough questions per process to test reliably the true level of each of Bloom’s cognitive processes of ‘remember’ and ‘understand’ prior to the lecture. With so few data points for the individual processes, small variations in responses produce large variations in final scores. Hence the 2D/3D group variations were not especially surprising.
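The point that small item counts amplify score variation can be illustrated with the standard error of a quiz-score proportion. This is a simplification rather than a model of the actual instrument: it assumes independent items each answered correctly with probability p.

```python
from math import sqrt

def score_se(n_items, p=0.5):
    """Standard error of the proportion-correct for a quiz of
    n_items independent items, each correct with probability p."""
    return sqrt(p * (1 - p) / n_items)

# Splitting an 8-item quiz into two 4-item sub-scales inflates the
# per-participant noise on each sub-scale by a factor of sqrt(2)
print(score_se(4), score_se(8))
```

Under these assumptions, halving the number of items from 8 to 4 increases the per-participant score noise by roughly 41%, which is why the pre-quiz sub-scale differences were treated with caution.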


The problem for the research design was to avoid impacting the outcomes with the measurement instrument itself. The post-quiz was taken within approximately 30 minutes of the pre-quiz, with only a single lecture delivered between those two measurement points. Providing more than 8 questions in the pre-quiz for a single 20 minute lecture would have increased the risk that the participants learnt from the pre-quiz questions relative to the lecture.


Furthermore, the concept of ‘remember’ and ‘understand’ for Bloom’s cognitive processes prior to instruction does not especially make sense in the context of the experiment. As discussed in Chapter 3 (instrument design), the development of the questions within the instrument was based upon the lecture. ‘Remember’ questions were extracted from the instructional content of the lecture whereas the ‘understand’ questions were derived from material not taught in the lecture. The pre-quiz questions were also specifically targeted at the four bridge types covered in the lecture to calibrate the extent of pre-existing content knowledge.


A participant being tested at each of these levels prior to instruction (where no certainty of prior topic learning experience can be established) can only be measured with respect to their pre-existing general knowledge of the topic, which may reflect either memory or understanding. The grouping of the pre-quiz questions into ‘remember’ or ‘understand’ in this discussion therefore reflects only the researcher’s knowledge of whether the topic of each question was subsequently taught directly in the lecture, not whether the participant was actually remembering or understanding at the pre-quiz stage.


The extent to which the split at the pre-quiz stage matters to the discussion is that if a participant already had an indicative level of ‘understanding’ prior to the lecture, that ‘understanding’ should improve when assessed after the lecture. If one group, for example, starts with a level of 60% and ends with 61%, this is possibly a worse outcome than the other group starting with 45% and ending with 58% (although there is also some discussion that could qualify even that conclusion).


5.2.3 The Results of the Post-Quiz

5.2.3.1 Post-Quiz Total Scores

An analysis of the results in the previous chapter for the total (i.e. combined Bloom’s) post-quiz scores between the 2D and 3D groups shows:

  • The pass rates for the 2D and 3D groups were 67% and 77% respectively; 10% more 3D participants than 2D participants achieved the pass mark of 10 out of 20.
  • Average scores for the 2D and 3D groups were 10.98 and 11.36 respectively. A 3D participant scored on average 0.38 higher than a 2D participant.
  • Median scores for the 2D and 3D groups were 11 and 12 respectively; the 3D participants scored higher at the second quartile than the 2D participants.
  • The mode for the 2D group was lower than for the 3D group: 11 and 12 respectively, demonstrating that more 2D participants scored 11 and more 3D participants scored 12. A score of 11 was obtained by 20% of the 2D group and 21% of the 3D group; a score of 12 by 11% and 29% respectively.
  • The range of scores for the 2D group was more than the 3D group, 5-17 and 6-17 respectively.

  • Standard deviation for the 2D group was slightly more than for the 3D group: 2.468 and 2.347 respectively; the 3D group’s total scores were therefore slightly closer to the mean (average score) than the 2D group’s.

  • Skewness was positive for the 2D group and negative for the 3D group, 0.052 and -0.229 respectively. This demonstrates that the 3D group’s scores were slightly higher than the 2D group’s. This skewness difference is due to the mean, median and mode differences between the two groups’ scores.
  • Kurtosis was negative (platykurtic) for the 2D group and positive (leptokurtic) for the 3D group, -0.2 and 0.3 respectively. As mentioned above platykurtic distributions are flatter at the top of a distribution curve whereas leptokurtic distributions are higher and peaked around the mean score. The differences in value of kurtosis between the two groups account for the probability density value being higher for the 3D group in the Gaussian distribution graph in Figure 64.


Summary & Interpretation: Post-Quiz Total Scores


The above analysis finds that the 3D participants scored better overall than the 2D participants in the post-quiz. Although this difference was not statistically significant in the t-test results (t = -0.8212, df = 119, two-tailed p = 0.4133, α = 0.05), the actual results indicate a slight difference between the two groups. Analysing the Gaussian distribution curves (Figure 64) shows that the 2D and 3D participants had a 15% and 16% likelihood respectively of scoring a 12 in their total post-quiz score. In general the overall results showed the 3D group performing better by 1%; this can also be seen in the inverse distribution graph (Figure 63), where the two groups run almost parallel to one another with the 3D group performing approximately 1% better in their overall test results.
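The 15% and 16% likelihoods read off the Gaussian curves in Figure 64 can be recovered from the reported means and standard deviations alone. The sketch below approximates "the likelihood of scoring a 12" as the probability mass of the fitted normal distribution between 11.5 and 12.5.

```python
from statistics import NormalDist

# Post-quiz total means and standard deviations as reported above
nd_2d = NormalDist(mu=10.98, sigma=2.468)
nd_3d = NormalDist(mu=11.36, sigma=2.347)

# Probability of a score landing in the unit band around 12
p_2d = nd_2d.cdf(12.5) - nd_2d.cdf(11.5)
p_3d = nd_3d.cdf(12.5) - nd_3d.cdf(11.5)
print(round(p_2d, 2), round(p_3d, 2))  # approximately 0.15 and 0.16
```

This continuity-style band approximation is one way of reading a discrete score probability off a fitted continuous curve; the figure itself may use the density value directly, which gives essentially the same comparison.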


Question 22 in the Likert scale survey asked participants to rate their level of knowledge on the topic ‘after’ the lecture. The low responses for the 2D and 3D participants were 22% and 23% respectively, and the medium responses 73% and 74% respectively. At the medium level the self-assessment was consistent with the 1% difference in the test results. At the low level the 3D group seemed more conservative, perceiving their knowledge to be less than the 2D group’s although the actual results showed the contrary. In either case, a 1% difference is within the margin of error.


5.2.3.2 Post-Quiz Remember Scores

Analysis of the results in the previous chapter for the post-quiz ‘remember’ scores between the 2D and 3D groups shows:

  • The pass rates for the 2D and 3D groups were 85% and 93% respectively; 8% more 3D participants than 2D participants achieved the pass mark of 5 out of 10.
  • Average scores for the 2D and 3D groups were 7 and 7.32 respectively. The 3D participants scored on average 0.32 higher than the 2D participants.
  • Median and mode scores were 8 for both groups.
  • The range of scores for both groups was the same, 3-8.
  • Standard deviation for the 2D group was higher than the 3D group 1.8 and 1.6 respectively, with a 0.2 difference between the groups.
  • Skewness was negative for both groups, with 2D and 3D values of -0.6 and -0.9 respectively, indicating that both distributions were moderately skewed toward the higher scores, the 3D group’s slightly more so.
  • Kurtosis was negative (platykurtic) for the 2D group and positive (leptokurtic) for the 3D group, -0.7 and 0.7 respectively.


Summary & Interpretation: Post-Quiz Remember Scores


The post-quiz scores mask a complexity that requires further consideration. Although the 2D group was normally distributed, the 3D group failed the D’Agostino-Pearson (K²) normality test (p = 0.01161, i.e. < 0.05). In order to compare the results of the two groups meaningfully, the researcher needed to examine why the 3D group failed the normality test and what, if anything, that implies for the interpretation of the apparently “better” 3D pass rates.
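The normality check referred to here is available in SciPy, whose `scipy.stats.normaltest` implements the D'Agostino-Pearson K² test. The bimodal score list below is hypothetical, chosen only to demonstrate the rejection branch that the 3D group's scores fell on.

```python
from scipy import stats

# Hypothetical, strongly bimodal score list (peaks at 3 and 8),
# echoing the bimodal shape discussed for Figure 65
scores = [3] * 10 + [8] * 10

k2, p = stats.normaltest(scores)   # D'Agostino-Pearson K^2 and p-value
if p < 0.05:
    print("normality rejected at alpha = 0.05")
```

A rejection here signals that the sample's combined skewness and kurtosis are inconsistent with a normal distribution, which is why the researcher turned to the histograms and density traces rather than relying on the t-test alone.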


Analysis of the histogram and density trace graphs in Figure 65 shows that both the 2D and 3D histograms display a bimodal distribution, with two peaks at 3 and 8. As can be seen in the density traces, the variance of the 2D scores between 3 and 8 was greater, causing the curve to flatten before its peak.


Although the statistical analysis determined that the difference between the pass rates and means (by which the 3D group was higher than the 2D group) was not significant when taken as a whole, there is a clear visual difference between the graphs that deserves explanation. When considered within specific score bands, the outcome slightly favours the 3D group because:

  1. 2D group participants were 8% more likely to score 4 or below,
  2. 3D group participants were 6% more likely to score 8 or above, and
  3. 3D group participants were 2% more likely to score 9 or above.


This analysis can be seen clearly in the frequency table below (Table 13. Frequency Table: Post-Quiz Remember).


Post-Quiz Remember

  Score   2D (Cumulative)   3D (Cumulative)   Difference (3D vs. 2D)
    0          0%                0%                  0%
    1          0%                0%                  0%
    2          0%                0%                  0%
    3          4%                4%                  0%
    4         15%                7%                 -8%
    5         25%               13%                -12%
    6         33%               27%                 -6%
    7         47%               41%                 -6%
    8         78%               80%                  2%
    9         98%               96%                 -2%
   10        100%              100%                  0%

Table 13. Frequency Table: Post-Quiz Remember (Rounded)


The frequency table shows a cumulative analysis of each group at each score. As can be seen in the table, the 3D cumulative percentages were in general lower than the 2D percentages at every score below 8. The implication is therefore that the relative performance of 3D versus 2D ‘remember’ outcomes is slightly better at the higher rankings (80% and above), but slightly worse at the lower pass-mark scores.
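Table 13 is a cumulative frequency table: each row gives the percentage of a group scoring at or below that mark. Its construction can be sketched as follows (the score list is hypothetical and the function name illustrative):

```python
from collections import Counter

def cumulative_percentages(scores, max_score=10):
    """Percentage of participants scoring at or below each mark,
    rounded to whole percent as in Table 13."""
    counts = Counter(scores)
    n = len(scores)
    running = 0
    table = {}
    for mark in range(max_score + 1):
        running += counts.get(mark, 0)
        table[mark] = round(100 * running / n)
    return table

print(cumulative_percentages([3, 4, 4, 5, 7, 8, 8, 8]))
```

Reading a group's column bottom-up, a lower cumulative percentage at a given mark means more of that group scored above the mark, which is why the 3D column's smaller values below 8 favour the 3D group.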


While the difference in the means may not be statistically significant, the results suggest that the outcomes at particular bands are potentially significant. To put this into context; if the desired group learning outcome is to achieve a pass or better, both methods of delivery were similar, but if the desired outcome is to maximise the potential scores, the 3D delivery might be indicated.


In general, the overall performance of both groups was better than for the score obtained in Bloom’s cognitive process of ‘understand’ which we will discuss in the next section.

5.2.3.3 Post-Quiz Understand Scores

Analysis of the results in the previous chapter for the post-quiz ‘understand’ scores between the 2D and 3D groups shows:

  • The pass rates for the 2D and 3D groups were 35% and 36% respectively; 1% more 3D participants than 2D participants achieved the pass mark of 5 out of 10.
  • Average scores for the 2D and 3D groups were 3.98 and 4.04 respectively; a 3D participant scored on average 0.06 higher than a 2D participant.
  • Median and mode scores for the 2D and 3D group was 4 for both groups.
  • The range of scores for the 2D group was more than the 3D group, 0-8 and 1-8 respectively.
  • Standard deviation for the 2D group was slightly higher than the 3D group 1.48 and 1.46 respectively. A 0.02 difference between the groups shows very little difference in standard deviation.
  • Skewness was positive for both groups: 0.068 for the 2D group and 0.332 for the 3D group. With both values close to 0 and only a 0.26 difference between them, the distributions of both groups were almost symmetrical.
  • Kurtosis was positive (leptokurtic) for both groups, at 0.558 for the 2D group and 0.010 for the 3D group. The 0.55 difference indicates that the 2D group’s distribution was somewhat more peaked around the mean than the 3D group’s.


Summary & Interpretation: Post-Quiz Understand Scores


From the above analysis both groups scored almost the same for Bloom’s post-quiz ‘understand’ results. This is clear from a study of the histogram and Gaussian distribution curve in Figure 66: both the 2D and 3D data points are almost identical.


Further, the frequency distribution comparison of the two groups confirms that the scored results at each rating band of the 2D and 3D groups exhibit no considerable difference.


Bloom’s cognitive process of ‘understand’ is a higher level cognitive process than ‘remember’. Given the pass results and the mean, median and mode scores, both groups scored ‘badly’ (35% – 36%) in Bloom’s cognitive process of ‘understand’. On the face of it, the results suggest that both groups did not show a ‘high’ level of understanding of the subject matter after training; however, it should be remembered that the mean, median and mode results are a reflection of the difficulty relationship between the questions testing understanding and the lecture itself. The decision was made during the design stage to include some ‘very high’ difficulty questions in the understanding question set to ensure a real test of the achieved level of understanding. Some additional light is shed on these results in the Likert scale and qualitative analysis that follows.


This research is primarily interested in the comparative difference of the two delivery methods, rather than the absolute scores, and for this purpose the results suggest that there is no significant or effective difference between the 2D and 3D group testing (quiz) results for the ‘understand’ cognitive process, within the confines of this experimental process.


5.2.4 Likert Scale Analysis

The above analysis of the quiz results showed a positive result for Bloom’s cognitive process of ‘remember’, whereas for Bloom’s ‘understand’ there seemed to be fewer participants in both groups who understood the subject matter of ‘The Physics of Bridges’ to the same level that they remembered it. In order to understand this result we turn to the Likert scales, where we asked the participants to assess the quality of the delivery method. Questions 23 and 24 specifically addressed these questions.

  • Question 23 asked whether “the subject matter was clear and informative”. The 2D and 3D groups’ responses were positive 98% and 100% and neutral 2% and 0% respectively. With the exception of the 2% neutral response, it would seem that the majority of people found the subject matter to be clear and informative. Of interest, the 2% neutral result was a single participant who actually performed better than the group’s average in the post-quiz for both cognitive processes of ‘remember’ and ‘understand’, with z-scores of 0.54 and 0.69 respectively. Given their actual results it seems that, within their group, this participant understood the material better than they remembered it.
  • Question 24 asked whether the lecture was detailed enough to understand the subject matter. The 2D and 3D groups’ responses were positive 100% and 93% and neutral 0% and 7% respectively. Of interest were the neutral responses, which came from the 3D group. These came from 4 participants whose post-quiz z-scores for both cognitive processes of ‘remember’ and ‘understand’ were below the group average, with the exception of one who scored better on the ‘understand’ post-quiz score than on the ‘remember’ score.


From the above results for questions 23 and 24, the majority of participants perceived that the lecture material was clear, informative and detailed enough for them to understand the subject matter. The few in the 3D group who were only neutral on whether the level of detail was sufficient to understand the topic achieved post-quiz z-scores below the group average, so their self-assessment appears to have been accurate.
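The z-scores used throughout this Likert discussion express each participant's quiz score in standard deviations from their group's mean, so a negative value means below the group average. A minimal sketch (population standard deviation assumed; the score list is hypothetical):

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Each score expressed in standard deviations from the group
    mean; negative values are below the group average."""
    m, s = mean(scores), pstdev(scores)
    return [round((x - m) / s, 2) for x in scores]

print(z_scores([2, 4, 6]))
```

Because z-scores are relative to each group's own mean and spread, they allow a neutral responder's performance to be compared against their group without the 2D/3D difficulty question arising.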


Question 29 asked if the topic was appropriate to virtual world learning. This question was asked in order to gain an understanding of participants’ views on the choice of topic delivered for instruction. The majority response for both groups was positive, with the 2D and 3D groups’ responses positive 84% and 79% respectively and neutral 13% and 18% respectively. Within the 2D and 3D groups the neutral scores accounted for 7 and 10 participants respectively. For the neutral participants in the 2D group, the z-scores showed that 4 performed below average for the cognitive process of ‘remember’ and 2 for the cognitive process of ‘understand’. Within the 3D group, the z-scores showed that 5 performed below average for ‘remember’ and 7 for ‘understand’. It seems from these results that although the majority of participants were positive about the choice of topic, a few were neutral on the appropriateness of the material to the environment, more so in the 3D group, in spite of the fact that the material was identical in both cases. Given their z-score results, the neutral responders in the 2D group still performed better for ‘understand’ than ‘remember’, while the neutral responders in the 3D group appeared to neither ‘remember’ nor ‘understand’ the topic well, suggesting their relative (to the group) self-assessment was consistent with their relative scored outcomes.


Question 28 asked participants whether the in-world learning method offered a better learning experience than their usual (real-world) learning methods. The results for the 2D and 3D groups were positive 74% and 73%, neutral 13% and 18%, and negative 3% and 3% respectively. Although the overall result was positive, there was more variance with respect to quiz scores in the responses to this question.


Question 26 asked participants if they experienced any technical difficulties. The majority of participants in both groups did not: the responses for the 2D and 3D groups were ‘No’ 91% and 93% and ‘Yes’ 9% and 7% respectively. For the participants who answered yes, the main problems were sound and picture loading delay (lag); all commented that it lasted only a short period and was rectified quickly. Although only a small number of participants answered yes to this question, the open-format questions showed that slightly more experienced some technical issues (apparently not perceived as sufficient to warrant a “yes” here), which will be discussed in the next section.


This group of questions essentially assessed the participant’s perception of quality, appropriateness, purpose and “fit” to the medium of the experience. Necessarily the responses to these questions are likely to be coloured by the participant’s perception of the lecture delivery system experienced (i.e. 2D or 3D). Throughout this group of questions the responses were very strongly positive while the worst grade with a significant number of responders was neutral (excluding Q26). With the exception of the assessment of the clarity of the material, the Likert assessments slightly favoured the 2D delivery method.


The slight favouring of the 2D delivery could be either an absolute result, or a result coloured by raised expectations of one or other of the two delivery methods. We need to investigate, therefore, the qualitative analysis of the open questions to adequately interpret this slight bias in the results.


Question 26 was a check-question to allow explanation of the results in the other questions should the results therein have proven dramatically negative.


5.3 Qualitative Analysis

From the qualitative analysis of the post-survey responses, many aspects of the participants’ learning experience emerged, as well as differences between the two groups in this study.


5.3.1 Thematic Analysis Results

As discussed in the previous chapter, the results of the post-survey open questions were grouped into themes and coded for qualitative analysis in order to provide further insight into the achievement results and the learning experience of participants. Four themes were identified in the analysis of the data, as follows:

  • Virtual World Learning
  • Virtual Learning Campus
  • Lecture Delivery
  • Survey Instrument


In this section we provide a thematic analysis of these themes that emerged from the post-survey.


5.3.1.1 Virtual World Learning

This theme was specifically related to the use of the virtual world platform as a learning tool, rather than to the delivery method of the presentation.


Convenience was the main factor mentioned by both groups. The factors identified included: doing it from home, in my own time, and not having to travel in order to learn. These sorts of comments are not specific to virtual world learning technology, as today many educational courses cater for students via online delivery. However, there was a sense of presence that participants felt from “being there with other people”, and seeing others learn seemed to make the experience more enjoyable for them than traditional or alternative learning methods. Quite a few commented on how the experience felt “personal like they were really sitting in a lecture room taking the course”; the atmosphere was relaxed and soothing, with less pressure than traditional classroom methods of learning. These comments are interesting, partly because the lecture mirrored a real-world lecture in that it could not be “paused” by a participant and ran for a fixed time per slide and a fixed time in total; to some extent it was therefore more rigid in delivery format than a real-world lecture, in which the lecture might be paused while a question is asked and answered.


Another theme that emerged was that this medium offered a new way of learning that was ‘on demand’, rather than a planned course for which one would have to prepare in advance. Similar to searching the web to find out about a specific topic, participants felt that this medium offered them a way to learn new material when they wanted, and to experience that material rather than just read it on a webpage. The lectures ran on a continuous loop over the experimental period, so this perception is reasonable, in spite of the fact that the lectures were not actually ‘on demand’.


The technology seemed to offer a learning medium that could reach people who traditionally would not engage in formal learning, and who had never before used the virtual world for learning. It seemed to inspire people to want to learn more and to undertake more learning exercises both in and out of Second Life. For many participants this was a new experience: they had never thought about using online virtual worlds as a learning platform, having only used the medium as a game rather than for taking a course. After experiencing this study, many were inspired to seek out more learning in Second Life or even in real life.


The overall impression from all participants was that the virtual world learning experience was fun and enjoyable. Very few negative comments were made about the experience, other than that the medium might have the potential not to be taken seriously, or might make it possible to cheat. The experience seemed to open people’s minds to the possibility that virtual world technology could be used seriously rather than just as a gaming environment. A comment from one participant sums up the general impression of this technology as a learning tool:


I'm still not convinced that virtual learning can replace learning in real world but now I think it might be possible.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.2 Virtual Learning Campus

This theme included comments made about the virtual learning campus, the setup and operations of the entire virtual learning environment in which the experiment was conducted.


The majority of comments were that the participants found it to be ‘user friendly’ and ‘easy to use’. The layout of the different rooms seemed to provide a fun way for them to learn. Only two people commented on having a problem with the signage: when they got to the post-survey room they missed the board that told them how to take the post-quiz.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3 Lecture Delivery

This theme is where the majority of comments from participants were made. These comments directly related to a participant’s learning experience of the research project. The range of comments was coded into sub-categories: format, information content, learning, facets of 3D learning, instruction, focus, navigation and technology constraints.


5.3.1.3.1 Format

This theme included comments on the layout and format of the slide presentation. The comments from both groups were mostly positive. Participants could offer comments in positive, negative or general sections of the survey. In total, the 2D and 3D groups made 11 and 24 clearly positive comments, and 3 and 1 clearly negative comments, respectively, in this theme.


The positive comments praised the layout of the slides and the way the information was presented. A few more negative comments came from the 2D group: one wished for the ability to interact with the pictures on the screen, another wanted annotation on the images (similar to the interaction question), and one had problems with the colour differentiation of the tension and compression markings (tension and compression were shown in red and green respectively, suggesting either colour blindness or a graphics card fault). Only one person from the 3D group made a negative comment in this area, identifying a desire for more pictures on the slides (the slides in the 2D and 3D lectures were identical).


While the largest proportion of the responses to the general comments question was provided by the 3D group, a common suggestion received from both groups concerning the format was a wish that the presentation could be paused or otherwise controlled, such as by forwarding or rewinding. As a proportion of each group that actually provided a comment at all, this suggestion was marginally more frequent among the 2D participants.


With respect to the 3D group’s comments about presentation speed, it seemed that although they had been presented with a model and voice-over that mirrored the images of the slides and the text therein, they still desired the opportunity to read the slides to view the information. The time per slide and the slides themselves were identical in both the 2D and 3D lectures, and were set to allow sufficient time for reading each slide – in fact the voice-over effectively read the slide to the participant. In the 3D case the addition of the 3D models in the same time window meant that participants had an additional vector of information to absorb in the same amount of time as the 2D participants. The researcher’s impression from the comments is that in the 2D case the motivator was the desire to review and contemplate the information, while in the 3D case it was more to do with the ability to absorb multiple information vectors simultaneously.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.




5.3.1.3.2 Information Content

This theme included comments to do with information content in the presentation. There were 56 comments from the 2D group and 33 from the 3D group.


For the most part participants found the presentation very interesting and informative, but in this area the 2D group seemed to be more satisfied than the 3D group. Within the 3D group a number of people desired more information, or perceived the information to be too technical to appreciate without additional enquiry or time – yet the information in both cases was identical.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3.3 Learning

This theme included comments to do with people obtaining new information. Comments from both groups were very positive here. All participants that commented in this theme stated they enjoyed the experience of learning and gaining new knowledge. Most seemed to enjoy the topic and the new knowledge on bridges that they took away with them, and/or considered that the material was well thought out and presented. Some commented that they enjoyed the opportunity of obtaining new knowledge in the virtual world/game space and were inspired to seek additional in-world learning.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3.4 Facets of 3D Learning

The comments in this category were specific to the 3D lecture with its use of models. The participants in the 3D group were universally positive about the use of 3D models. Many seemed to believe that having models accompany the presentation assisted them in understanding the subject matter. (Note, however, that the test scores did not reflect a significant advantage from the 3D models with respect to understanding, although there were indications of an advantage in remembering.)


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3.5 Instruction

The comments in this category had to do with the method by which the new knowledge was transferred to the participant. In this area a small but significant number of participants in both groups commented that they missed having a real person to whom they could put questions to clarify the information; this was more pronounced in the 3D group, which seemed to want to find out more about the topic than was presented to them. (Note, as mentioned, the information was identical in both cases.)


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3.6 Focus

The comments in this category had to do with observations affecting attention and the temporal learning experience of a participant.


This theme emerged through the general comments throughout the survey. There seemed to be two broad sub-groups of comments in the focus theme: the presence of distractions during the learning experience, and the participant’s perception of the available time per slide for learning. Although both groups experienced the same general learning conditions and real-world times, there seemed to be opposing perceptions of the significance of sources of distraction, and of time, across the two groups during the presentation. We will break this category into these two sub-themes (distractions and time) to better understand the focus aspect of the participant groups.


Distractions

The sources of distraction seemed to come from either the outside world or the inside world.


Inside world distractions
Only three comments from the 2D group concerned inside-world distractions: distracting avatars, a participant’s outfit getting in the way of their view, and a participant distracted by their curiosity about the technology setup used to deliver and manage the lectures.


In contrast, quite a number of people in the 3D group complained about inside-world distractions, particularly being annoyed by other avatars disrupting their learning. As a group, the 3D participants were comparatively emotional/animated (with respect to the 2D group) in their response to these distractions, and in a number of cases complained that the other people were not taking education as seriously as they were.


Outside world distractions

A small number of the 2D group commented about outside-world distractions, or noted the advantages of staying in touch with the outside world: being able to answer the phone, using Yahoo messaging, doing things at their desk, and people in real life talking to them.


In contrast, only one member of the 3D group commented upon outside-world distractions.


Time


The main theme that emerged from the 2D group was that a small number of participants commented that the presentation was a bit slow, and/or that their attention wandered or they “zoned out” during some slides. Contrast this with the 3D group, who tended to say that the presentation was fast; a reasonable number even complained that it went too fast. The 3D group commented that the material kept them engaged and the presentation held their attention. In both cases the real-world times were identical – so the observations are directly related to perception and, in the light of other comments made, the implication is that there was a difference in perceived ‘engagement’ that arose from the single variable of the presence of the 3D objects.


The 2D participants who observed that they occasionally ‘zoned out’ during some of the slides also commented that the voice-over was too smooth/calm. Nobody in the 3D group observed this problem; conversely, a number commented on how the voice-over was exactly right for the presentation and kept their attention. Interestingly, the voice-over was identical in each case – but the presence of the 3D objects appearing around participants may have presented an additional level of stress that the calm voice-over helped to counter.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3.7 Navigation

Learning the appropriate method of avatar navigation has traditionally been a significant problem in virtual-world training experiments, typically compounded by the use of first-time virtual world participants unfamiliar with the control of their avatar. This researcher considered this a flaw in previous studies, one that distorted results with a temporary difficulty that would be overcome with only a small amount of in-world experience. The participants in this study, therefore, were intentionally recruited from users already present in Second Life rather than brought into the virtual world specifically for the purpose of the experiment.


Consequently the negative comments on navigation were fewer than in previous studies, and not generally of the same fundamental ‘how do I operate my avatar?’ nature present in a number of the studies considered in the literature review. In any case the campus and lecture environment was specifically designed to minimise the likelihood of these types of problems, and required only minimal knowledge of avatar controls (sufficient for someone with about 30 minutes of experience, based on the packaged avatar training in the Second Life orientation islands).


The comments in this category had to do with how participants viewed the presentation through their avatar. These comments were complaints from the 2D and 3D participants about some viewing aspect of the presentation.


Three of the 2D group complained that the chairs blocked their view of the presentation. It was obvious from this comment that these people lacked the knowledge to use mouse view, relied instead on third-person view, and did not understand how to control the third-person roaming camera effectively.


The 3D group’s complaints provided the most insight into how they viewed the presentation. A small, but significant, number of the participants complained that the 3D models of the bridges ‘got in the way’ of their reading of the slides (a function of navigation) or that they could not both read the slides and look at the models (a function of time). Although avatars were not seated once the 3D presentation began and were free to wander around the space, with slides projected onto the walls around the models, some users clearly did not realise this additional freedom allowed them to position their avatar for clear slide viewing at any time. Further, it seemed that, although presented with a 3D model and a voice-over that covered the entire slide content, a number of the 3D group still attempted to use the traditional method of viewing the slides whilst looking at the models.


Refer to Appendix M: Qualitative Analysis: A Sample of Participants Comments for a representative sample of comments from participants. Nonsense strings and repeated comments were excluded.


5.3.1.3.8 Technology Constraints

This category contained comments by participants about the technology constraints that they experienced during the lecture delivery. Although this question was also asked in the Likert questions, as provided in the previous section, where the 2D and 3D groups responded 9% and 7% respectively, more participants identified technical problems in their open comments.


From the 2D and 3D groups’ comments, 20% and 18% respectively identified at least one technology constraint. All participants who answered ‘yes’ to the Likert question were among these; therefore a further 11% in both groups commented upon having a technology-related problem. The technical difficulties were due to sound and lag/object rezzing, the same problems given by the participants in the Likert questions.


As discussed in the literature review, this technology is streamed in real time, therefore ‘lag’ is a common risk in using it and will vary with network connection speed (real lag) and individual computer problems (false lag – but possibly the single most common culprit). No one, however, commented that the lag affected their ability to learn. In most cases where it was reported, the lag caused only a slight delay in the slide show, with comments noting that they experienced ‘some’ lag. As each slide, audio track and object was independently synched, lag problems could not accumulate across the slides, and any synching problems were corrected with the next slide (or in some cases half way through a slide).


The sound constraints were only temporary in all cases. This problem was due to drop-outs of the presentation voice-over. It was picked up early in the testing phase, where occasionally the audio would stop and a re-log of the application was required in order to get the audio back. Because this was picked up in testing, signs were placed around the lecture screens instructing the participant to re-log if they experienced audio drop-outs. In all cases where participants complained about an audio drop, they also noted that a re-log solved their problem quickly. The impact of an immediate re-log on learning would be, at most, the loss of half the content of one slide. As all slides were summarised at points during the presentation, the participant was unlikely to completely miss the associated material.


5.3.1.4 Survey Instrument

This category included comments that related to the pre or post survey instrument.


Six participants across both groups commented that the pictures in the diagrams of the post-quiz were too small. From their comments they had trouble distinguishing some of the bridges in the pictures.


As the display size depends upon a person’s monitor size, participants with small monitors may have had problems distinguishing the details in the pictures. The survey displayed correctly on a 17 inch monitor at 96 dpi, but anyone with a smaller monitor or unusual resolution settings may (possibly) have had problems.


This problem was not realised until quite a number of participants had already completed the research. It was therefore decided that any change to the picture size in the survey would only corrupt the experimental conditions and might bias the results, so no modification was made. Therefore all participants that undertook this research operated under the same picture constraints in the survey.


On review of the results of the six participants that complained, three were from the 2D group and three from the 3D group. Their post-quiz scores for ‘remember’ and ‘understand’ were 9, 7; 9, 4; and 9, 4 for the 2D group, and 8, 4; 8, 5; and 8, 4 for the 3D group. All of these participants passed both of Bloom’s cognitive process categories. Their z-scores within their groups for ‘remember’ were all above average, but for ‘understand’ they scored at or below the group average.
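As a hedged sketch, the z-score comparison used above can be computed as follows. The group distribution shown is purely illustrative (the actual group score distributions are not reproduced here); only the method is intended to match.

```python
from statistics import mean, pstdev

def z_score(score, group_scores):
    """Standardise a quiz score against its group's distribution."""
    mu = mean(group_scores)
    sigma = pstdev(group_scores)  # population SD: the scores cover the whole group
    return (score - mu) / sigma

# Hypothetical 'remember' scores out of 9 for one group (illustrative only)
group = [5, 6, 7, 7, 8, 8, 9, 9]
z = z_score(9, group)  # positive z: this participant scored above the group mean
```

A positive z-score indicates an above-average result within the group, which is how the ‘remember’ versus ‘understand’ contrast for these six participants was characterised.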


There were 9 ‘remember’ questions and 8 ‘understand’ questions in the survey that required the participant to use pictures in answering the question. Bloom’s cognitive process of ‘understand’ would have been more affected by the picture constraints: the ‘understand’ questions were substantially more difficult, involving material that was not presented during the lecture, so the participant had to use the picture to recognise and assimilate information in order to answer the question.


The researcher notes that this problem may have contributed to some of the low scores, especially within Bloom’s cognitive process of ‘understand’. Although only 6 out of 111 people complained about this problem in their comments, there is no way to know how widespread it was; from the lack of comments we can only assume that this was not a constraint for most participants – or, at least, not one they realised they were experiencing.


5.3.2 Qualitative Analysis of Thematic Results

5.3.2.1 Introduction

The survey comment questions were not compulsory, but fewer than 4% of responses were nonsense or non-responses, with an average of 100 words per person, and 3D participants provided approximately 12% more comment volume than the 2D participants.


Interpreting the collected thematic responses was aided by the consistency of the emotion and approval expressed by participants, the surprising number of instant messages sent directly to the researcher in thanks for the experience, and the range of both supportive comments and recommendations provided in the open comments. To that end the researcher offers the following generalised collation of the qualitative opinions expressed by participants.


The general lack of negative observations in this collation reflects the same proportion in the underlying data. Three positive and three negative observations were requested, as well as open/general comments. Overwhelmingly, the positive question was populated, while the negative question was generally underpopulated or contained comments like ‘I have none’. The most frequent negative comments were an expressed desire to control the delivery speed, to acquire additional information in some way, or the opportunity for distraction; in some cases these were also identified as positives. The lack of colour in the negative comments contrasted with the diversity of the positive comments. Different participants chose to comment on different positive aspects of the experience, and an individual participant tended to concentrate their comments within a theme.


To aid interpretation of the analysis while avoiding the implication of hard statistical interpretation – some degree of researcher subjectivity and ‘translation’ being involved – the researcher has used the following terms, with some degree of overlap at the margins:

  • Few – 5% or less of comments
  • A number – 5% to 15% of comments
  • A significant number – 15% to 25% of comments
  • Many – More than 25% of comments
  • A majority – More than 50% of comments
  • Most – More than 60% of comments


Outside of these terms the researcher has provided clear absolute percentage counts where the numbers are at the extremes.
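The scale above can be expressed as a simple lookup helper. This is a sketch of our own, not part of the thesis; note that the source terms deliberately overlap at the margins, and this helper resolves each boundary to a single band.

```python
def descriptor(fraction):
    """Map a proportion of comments (0.0-1.0) to the descriptor scale above."""
    if fraction > 0.60:
        return "most"
    if fraction > 0.50:
        return "a majority"
    if fraction > 0.25:
        return "many"
    if fraction > 0.15:
        return "a significant number"
    if fraction > 0.05:
        return "a number"
    return "few"  # 5% or less of comments
```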


5.3.2.2 The Virtual Learning Experience: Both Groups

The two most used words participants chose to describe their experience were ‘fun’ and ‘interesting’. The frequency and strength of these positive comments surprised the researcher, representing over 60% of the participants.


The virtual world seemed to offer participants a fun way to learn, with the convenience of learning online in their own time, but further – at least as the experimental campus and lecture rooms were constructed in this experiment – it offered a participant a sense of presence similar to that of learning in a real-world learning environment. Seeing others in the environment while attending a lecture as their avatar in a simulated theatre gave the participant more of a connection to the learning process than one might expect from a purely HTML-page-based traditional distance education course. To the majority of participants this experience felt personal and the atmosphere relaxed, and many found that it offered a more pleasurable experience than the traditional learning method of attending a lecture class in the real world.


The environment seemed to promote a favourable attitude to learning. Not only did the majority of the participants say it was “fun”, but a number commented that they felt inspired to learn more about the topic, wanted to ask further questions, or sought more detail; and a significant number expressed surprise that, although they clearly had experience of the topic in real life, they had never really considered how exciting a bridge could be. Only one participant expressed an unfavourable attitude to this form of learning and/or the topic.


Based on the comments, the average participant was clearly immersed in this aspect of virtual learning, as reflected by many comments expressing varying degrees of ownership over the experience – and even, in some cases, resentment when others or extraneous circumstances interfered with their learning.


To many this was a new experience in a virtual world, and although they initially saw the offer of ‘linden’ as an easy way to make fast money, by the end of their experience, instead of thanking the researcher for the money, they thanked the researcher for the learning experience. Some of the comments expressed surprise that the game they had known before was no longer ‘just’ a game to them. Participation had opened the possibility of a whole new world of learning, inside and outside of Second Life.


The virtual learning campus provided the participant with a seamless way to learn. Many liked the staged approach reflected by the testing and learning process (necessary as part of the automated control regime for the experimental process), finding it a novel approach to the learning experience. Going from room to room to complete each stage in the learning process possibly made this more fun than an alternative virtual world learning approach utilising a single classroom in which all stages of the process might occur. Not knowing where the teleports would lead them in the next stage of their journey gave the environment an exploratory feel. Most participants found the environment very easy to use and welcoming.


The format and the information provided in the slide presentation received, for the most part, positive feedback. The request for more control over the slide show – to pause, forward and rewind – came from both groups. Enabling user control like this was not an option in this experiment, as information delivery for both groups had to be placed under strict experimental conditions so that only one independent variable changed – the presence or absence of the 3D models.


Even so, if this or a similar lecture were not under experimental conditions, the researcher cannot help but question whether this addition would have lessened the overall experience of the participant. Sharing in the learning process within a set time frame, and the pressure of the quiz after completion, may also have added to the positive experience felt by the participants. Allowing the user to walk away with additional material might have given the participant the convenience to learn more than just the information presented. In addition, a live lecturer, as some participants would have liked to have seen, might also have satisfied the participants’ requirements for more control over the information.


Technology constraints certainly presented themselves in this experiment, with approximately 20% of the participants from both groups commenting upon a technology issue to varying degrees. The major problems related to network latency (lag) and audio drop-outs. In a streamed world such as Second Life, especially when there are many avatars in a SIM, lag is a typical problem. Audio, although not as bad or as frequent as visual lag, does occasionally present a problem in Second Life: the audio stream is occasionally lost, and the only way to fix the problem is to re-log the application. Judging from participants’ comments, neither problem seemed to affect their learning experience, and for only 7-9% did it warrant rating as having an impact. In the experience of this researcher, the majority of lag-class problems are in fact not network lag but recipient computer performance issues. The entire SIM and the various lecture rooms were monitored continually during the experiment; true (network) lag was not observed on the researcher’s computers, nor did the SIM performance statistics monitored during the period demonstrate any significant decrease in performance.


Approximately 5% of people from both groups complained that some of the pictures in the survey instrument were too small, potentially obscuring the details of the affected bridges displayed. This could have been a greater constraint on a participant’s ability to answer the Bloom’s cognitive process of ‘understand’ questions than the ‘remember’ questions, and therefore may have contributed to perceptions of difficulty in the ‘understand’ portion of the post-quiz.


5.3.2.3 The Participants: Differences Between Groups

Whilst the 3D participants were presented with 3D models to aid learning, a number still seemed to be reading the slide show presentation. This effectively provided the 3D participants with four channels of learning – slide show pictures, slide show text, audio and models – whereas the 2D participants had only the first three.


There were 24 slides, 20 of which were learning slides, provided within a 20 minute lecture session for both groups. This meant a participant had approximately one minute per slide in which they were presented with something new. There were 11 3D models of 4 bridge types, therefore a new model was presented approximately every 2 minutes. Combining the models with the slides in the same time frame as the 2D participants may have disadvantaged the 3D participants.
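The pacing arithmetic above can be checked directly; all figures are taken from the text.

```python
lecture_seconds = 20 * 60      # 20 minute lecture session, same for both groups
learning_slides = 20           # 20 of the 24 slides carried new material
models_3d = 11                 # 11 models across 4 bridge types (3D group only)

seconds_per_slide = lecture_seconds / learning_slides  # 60.0: roughly one new slide per minute
seconds_per_model = lecture_seconds / models_3d        # ~109: a new model roughly every 2 minutes
```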


The information content delivered to both groups was the same: no more or less technical, and providing nothing new, with the exception of the 3D models for the 3D group. Yet from the 3D group’s comments some participants seemed to want more information or simpler explanations. Within the 2D group many commented that it was easy to follow, not too technical and easy to comprehend – none commented that the material was complex. Possibly the difference is not that the 3D group needed more information, but rather that with four information channels there was too much information in the time allocated. Alternatively, the difference might reflect a case of ‘not knowing what you don’t know’ in the 2D group, while the addition of accurately constructed 3D models raised additional questions in the minds of the participants, or improved their general level of attentiveness.


The 3D group found the addition of 3D models to be a useful learning tool. From their comments it seemed that the 3D models of the bridges were perceived to have helped them understand the subject matter better than they believed they would have with a lecture without the models. (Note, however, that in this case the perception is not supported by the test results.) Many participants perceived that the 3D models also made the entire lecture experience more engaging than whatever assumed alternative against which they were measuring the experience.


The focus of the 3D participants was directed more strongly inside the world rather than their outside world. Furthermore, the distractions arising from their focus inside the world brought about more emotional responses than the distractions noted by the 2D participants: the former tended to use repetition, descriptive adjectives and emphatic declamations concerning distractions, while the latter tended merely to note, or comment favourably upon, the ability to be distracted. This seems to suggest that the 3D participants experienced a greater feeling of presence, and possibly immersion, in their virtual world learning experience.


To appreciate these comments, the reader is referred to the literature review where the difference between immersion and presence is discussed (see page 39). Immersion, or ‘system immersion’, is an objective measure: the extent to which a person becomes removed from their outside world to operate within the virtual world space. Presence, by contrast, is a subjective measure: the extent to which a person feels connected inside the virtual world – the feeling of ‘being there’ and their ‘willingness to suspend disbelief’ that they are a part of, and inside, the virtual world.


In the classification model presented by Benford (see Figure 9. Shared Space Technology According to Artificiality and Transportation), virtual reality environments are placed on a scale of artificiality and transportation. The degree to which a participant becomes removed from their local space to operate in a remote space is transportation, which in Benford’s model is based purely upon the physical aspects of the virtual environment.


In this study the strong difference in the emotion and terms consistently used by participants in the 2D versus 3D lectures seemed to suggest that, given the same virtual reality technology (desktop CVE), a greater transportation occurred for the 3D participants. The 3D participants became removed from their local world distractions and were transported into the virtual remote world. This in turn led to a higher degree of presence within the virtual environment. The 2D comments of distraction compare with the results obtained by Martinez, Martinez, & Warkentin (2007), reviewed in Chapter Two Literature Review: they found that when participants were presented with a 2D lecture in-world, participants reported distractions or a ‘disconnect’ from the lecture (see p. 86).


The degree of presence in the environment is often linked with desktop virtual worlds based around social interaction. As discussed in the literature review, Schroeder defines presence in terms of presence, copresence and connected presence (see Figure 10), which can be described respectively as ‘being there’, ‘being there together’ and ‘being connected together’. As discussed in the literature review, the level of presence in a social virtual world is greater than in a game virtual world due to the social connective aspects that occur within it. Heeter also holds that the presence of an individual is increased when social relationships are formed within the environment. In this study, both groups were given the same social interactive aspects, yet it seems that the introduction of 3D models produced a higher level of presence for the 3D participants, who clearly displayed more ‘ownership’ over their learning experience than the 2D group.


Of interest, this higher level of engagement by the 3D group carried over to the volume of survey responses. The 3D group provided more descriptive and richer comments than the 2D group. Rather than the short dot points often used by 2D participants, the 3D participants tended to use full sentences in their open comments. The researcher was left with the subjective impression that the 3D participants, as a group, were motivated to greater detail and consideration in their comments than was typical of the 2D group. Although not specifically measured, it is possible that the 3D group were still engaged with the experience even after they had left the lecture environment.


A further noticeable difference between the two groups was their relative concept of time. The 2D group made more comments that the slide show was a bit slow, whereas the 3D group made more comments that the lecture was too fast (note the actual timing and content were identical). This differing perception of time is most likely due to a combination of the extra channel of information delivered to the 3D participants (the 3D models), which had to be absorbed in the same time span as the 2D participants, and the higher level of engagement the 3D participants expressed about their learning experience. One cannot rule out the effects of a possible unmeasured elevation of participant stress from the more “intense” learning experience arising from the addition of the extra information channel.


5.4 Discussion of Results

This research sought to find the difference in learning outcomes between participants presented with two different delivery methods: a 2D slide show, and the same 2D slide show augmented with 3D models and simulations.


For the quantitative analysis, the learning outcome measure was the difference in achievement scores between the 2D group and the 3D group.


Did they learn more after being presented with a 2D slide show or a 3D simulation model? From the results of both groups there was a slight, not statistically significant, lean towards the 3D group on the total post-quiz scores. When analysed within each of Bloom’s cognitive processes of ‘remember’ and ‘understand’, the 3D group performed slightly better than the 2D group (most notably at the upper score ranges) in the ‘remember’ dimension, but there was no appreciable difference in the ‘understand’ dimension. The subjective interpretation might be that, with respect to the ‘remember’ outcome, the 3D approach may assist ‘stronger’ students to do better than they would otherwise do under the 2D approach, but that there was little impact on the ‘average’ student. The study measured only the ‘instantaneous’ ‘remember’ outcome, not the ‘remember’ outcome over an extended period, which might reveal greater differences.


Regardless of any anecdotal differences that may have been found, and the foregoing comments, the statistical analysis of the post-quiz scores across both groups revealed no statistically significant difference between the two groups’ learning outcomes within the confines of this experimental model. Thus the hypotheses defined for the quantitative analyses of this experiment remain unconfirmed.
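
The chapter does not name the specific test statistic used; the sketch below assumes a pooled two-sample t-test, a common choice for comparing two independent groups’ quiz scores, and uses invented illustrative scores rather than the study’s data.

```python
# Illustrative only: the score lists below are invented, not the study's data.
# Minimal sketch of a pooled two-sample t-test of the kind used to compare
# post-quiz scores between a 2D and a 3D group (standard library only).
from statistics import mean, variance

scores_2d = [12, 14, 11, 15, 13, 10, 14, 12]  # hypothetical scores out of 20
scores_3d = [13, 15, 12, 14, 16, 11, 13, 14]

n1, n2 = len(scores_2d), len(scores_3d)
m1, m2 = mean(scores_2d), mean(scores_3d)

# Pooled variance across both groups (equal-variance assumption).
sp2 = ((n1 - 1) * variance(scores_2d)
       + (n2 - 1) * variance(scores_3d)) / (n1 + n2 - 2)
t_stat = (m2 - m1) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5

# Two-tailed critical value of Student's t for df = 14 at alpha = 0.05.
T_CRIT = 2.145
significant = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, significant at 5%: {significant}")
```

With these illustrative numbers the 3D mean is slightly higher but |t| falls well below the critical value, so the null hypothesis is not rejected, mirroring the pattern reported above.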


Learning outcomes for a student are traditionally measured by achievement scores. Although an important measure, this provides no insight into the learning experience of the student. A high achievement score measured by quantitative methods is not, in itself, a measure of success from a qualitative perspective. Quantitative methods focus on outcomes; qualitative methods focus upon the journey that leads the student to their end results.


While both the 2D and 3D groups were strongly positive about the learning experience, the qualitative analysis of both groups’ open comments revealed noticeable differences between the two groups’ journeys to their end results. The 3D group tended towards greater ‘ownership’ of their learning experience; while the 2D group tended to merely observe, with the opportunity for distraction (in some cases seen as a benefit), the 3D group almost universally expressed resentment, or even anger, about the same distractions.


The experimental constraint of ‘same time’ may have adversely impacted the 3D group’s scored outcome due to the delivery of an additional information channel over the same time frame, even though at least two of the channels were effectively redundant. However, as the two groups performed the same, and if anything the 3D group did slightly better, such a conclusion is by no means certain. The effect may rather have been to induce greater involvement by raising the stress factor for the 3D group and forcing greater participation in order to ‘keep up’ with the information flow.


The presence of the 3D models was widely perceived by the participants to enhance their understanding of the subject matter – although the scoring suggests that they assisted with remembering rather than understanding.


From the literature review of previous research it was found that virtual world learning does take longer than traditional methods (Arreguin, 2007; Joseph, 2007). In this lecture both groups were provided 20 minutes for a post-quiz of 20 questions. Although the 2D participants’ comments did not indicate a problem with the time allocated to the lecture, given the results of the post-quiz, particularly for Bloom’s ‘understand’, it is possible that both groups needed more time in which to understand the material, particularly the 3D group, who were presented with an extra channel of information which could be interactively explored.


Of the Likert scale questions, questions 28 and 29 showed the most variation across the participants. These questions were specific to a participant’s learning experience. Question 28 asked if they found the learning experience better than their usual methods of learning; the vast majority from both groups agreed.


When asked in the Likert scale if the information provided was enough to understand the topic, the 2D group was slightly more satisfied than the 3D group. The open questions shed some light on this issue, with more 3D group participants expressing a desire for more time to assimilate what was provided and more opportunity for self-driven information collection, questioning and investigation, rather than merely more information per se. This difference might also reflect the greater level of participation, immersion, presence or transportation evidenced in the 3D group.


5.5 Conclusion

In answering the research question (How effective is it to learn in a virtual world using a traditional 2D slide show method compared to that of a 3D interactive simulation?), the conclusions from this research are clear, and not necessarily as the researcher expected at the commencement of the study:


  1. Transportation of a 2D real world lecture presentation into a virtual world situation is an acceptable use of the virtual world technology, producing no statistically different outcome for Bloom’s ‘remember’, ‘understand’ and combined cognitive processes at the mean, although there are some indicators that the ‘remember’ outcome might be enhanced at the upper and lower deciles of participant ability through augmentation of the 2D presentation with 3D representation and simulation.
  2. Adoption of 3D visual aids is not a prerequisite for successful learning in a virtual space.
  3. The presence of 3D visual aids assisted participants’ perceptions of enjoyment, engagement, presence, immersion and/or transportation, and may therefore have a longer term effect on participation rates where participation in learning is purely voluntary.


Projecting these conclusions into a practical teaching scenario, where outcomes are the same and only instantaneous outcome measures are considered (the researcher did not examine long term outcomes), and after taking account of the input costs of material preparation, it is clearly more cost effective to use the 2D presentation strategy for delivering virtual world courses. This conclusion holds where cost is measured in terms of the time required for input preparation regardless of sourcing (where the 3D models are acquired for no input hours and no financial cost, the cost measure would void the observation), and outcomes are measured in terms of scored test results taken within a short period of the learning.


Where the outcome measure includes participant perception of the experience, the 3D augmented learning approach is indicated, but in this scenario, grading the relative ‘worth’ of the greater experiential outcome is more difficult and it is less clear how it can be factored absolutely into a cost benefit analysis.


5.6 Opportunities for Further Research

Experimental research, as the name suggests, applies scientific methods and analysis to gain new insights so that other researchers can pick up from the experiment to reproduce, reform and critique it. In this section the researcher proposes some opportunities for further research based upon the analysis of the results discovered in this research.


5.6.1 Improving Instrument Reliability

One limitation that is difficult to avoid was found in the analysis of instrument reliability using formal (statistical) reliability testing. Essentially, in this experiment there were too few questions within each of the two Bloom’s cognitive process test sets to provide a conclusive reliability measure of the instrument. Increasing the number of questions within each group would certainly provide more data points with which to measure achievement results and, as a consequence of how the reliability measure algorithm works, would improve instrument reliability. The first obvious problem faced with the pre-quiz and post-quiz design for this type of experiment is that, as the number of test questions (data points) is increased, there is a point at which the testing might materially affect the training experience and therefore the outcomes, as the participants would eventually start learning from the quiz questions.
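
The effect of item count on reliability can be illustrated with the Spearman-Brown prophecy formula, which predicts how a reliability coefficient (such as Cronbach’s alpha) rises when a test is lengthened with parallel items. The thesis does not name its reliability statistic, and the starting reliability value below is invented for illustration, not a figure from this study.

```python
# Hypothetical sketch: the Spearman-Brown prophecy formula, predicting the
# reliability of a test lengthened by a given factor with parallel items.
# The starting reliability (0.60) is an assumed value, not from this study.
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when the test is lengthened by `length_factor`."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

r_10_items = 0.60  # assumed reliability of a short 10-item quiz
for factor in (1, 2, 3):  # the same test at 1x, 2x and 3x the item count
    predicted = spearman_brown(r_10_items, factor)
    print(f"{10 * factor} items -> predicted reliability {predicted:.2f}")
```

Under these assumptions, doubling the items lifts the predicted reliability from 0.60 to 0.75 and tripling lifts it to 0.82, which is the mechanism behind the observation that more questions per cognitive process set would improve the instrument’s reliability measure.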


If the number of questions were to be increased, the range of information presented to the participant would also have to increase. Increasing the range of information provided would require additional time to be allocated to the lecture and possibly to each topic therein. There is a point at which the length of time required to complete the lecture and quiz / survey combined would affect the quality of the results, as the voluntary participants might judge the exercise to be taking too much time and rush the final testing / survey stages.


5.6.2 Course versus Lecture

The experiment focussed on a single lecture. Measuring the affordances over a sequence of lectures using a similar experimental model would provide additional depth of analysis and would neutralise any initial ‘wow’ factor that might have influenced participation and attentiveness in this single event based experiment. It is possible that differences in outcomes might be more apparent between the two groups if a course were involved rather than a single lecture. There are other factors that might influence such an experiment design, such as motivation for attending the course in the first place.


5.6.3 Introducing a Real and Robot Presenter to the Experience

The 3D group displayed a higher level of presence in this research study. The contributing factor in this observed difference between the two groups was, prima facie, the 3D models. An opportunity for further research lies in the introduction of a presenter (even an automated robot presenter) into the lecture experience, to see whether the increased level of presence shown by the 3D group would occur for both groups given a live or virtually-live lecturer. As presence is generally shown to be increased by relationships with other people within a virtual world, the introduction of a lecturer may add further insight as to why the 3D group displayed a higher level of presence given they only had the addition of 3D models.


5.6.4 Testing Other Bloom’s Cognitive Processes

The 3D group seemed to believe that the models contributed to their understanding of the subject matter. Testing higher levels of Bloom’s cognitive processes, such as Apply, Analyse, Evaluate and Create, may reveal whether this increase in understanding produces differences between the two groups at those higher levels.


5.6.5 Outcome Measurement Over Time

In this experiment the post-quiz was given directly after the lecture. Re-testing the participants over a number of periods would assess which group retained the information better for longer, and the extent to which the two approaches impacted understanding outcomes over time. Such an experiment would probably require a vastly greater number of initial participants so that each time-lagged testing group could be tested once, at a different interval, rather than re-tested, so that the testing itself did not colour the results. The researcher suspects that the greater level of post-lecture engagement demonstrated by the 3D participants might result in both slower degradation of the ‘remember’ outcome and a post-lecture improvement in the ‘understand’ outcome over time.


5.6.6 Comparison to Real-World Training

Perhaps the most obvious inquiry that presents itself for further research is the inclusion of another experimental group. As the virtual world 2D lecture was effectively a real world lecture delivered in a virtual world, the addition of a real world participant group operating under the same constraints as the virtual world groups would provide an interesting control reference for a virtual versus real world comparison of outcomes. Providing the 2D presentation to real life participants may offer further insight into the differences of the virtual learning experience, in addition to providing a control group based around more traditional learning methods.



