Evaluating SFP’s Outcomes Effectiveness

Evaluation of the effectiveness of the Strengthening Families Program (SFP) is frequently required by funders, and positive family outcomes can be useful in grant applications. Results can also guide agencies in improving their implementation of SFP.

SFP Analytics, under the direction of Karol L. Kumpfer, Ph.D., developer of SFP, offers affordable evaluation services using the SFP Pre-Post Survey, whose questions are derived from well-established and psychometrically sound assessments. These questions have been refined over more than 25 years of use to reliably measure parent and child outcomes. The survey instrument is designed to assess parenting styles, child and parent mental health, substance abuse risk and resiliencies, family management and cohesiveness, and parent and child social skills and attitudes.

To reduce reported “test fatigue” and increase the likelihood that parents and youth would complete their SFP surveys accurately as we migrated to online survey use, SFP Analytics recently had a statistical analysis performed on SFP Pre-Post Survey outcome data to see whether the 108-question survey could be shortened and still measure outcomes and effect sizes accurately.

A university statistics and epidemiology professor ran a factor analysis on 1500+ completed surveys (part of the 6000 Group Norms database), using exploratory structural equation modeling (ESEM) techniques to see whether shorter scales could be used. He then ran Item Response Theory (IRT) models on the data to show how well each “item” (question) performs as a measure of the underlying latent “trait” (such as “parent involvement” or “family conflict”). We also had the questions analyzed statistically to see which were most sensitive to change.

The statistical analysis of the Strengthening Families Program (SFP) Pre-Post Survey outcome data showed which of the 108 questions were essential for a valid assessment of change. As a result, the SFP outcome survey was reduced from 108 questions to 54 questions, 10 of which are demographic questions. This shorter survey has been used by dozens of families from multiple agencies, with excellent results.

Data Collection Methods

SFP Family Coaches administer a paper survey, or direct the family on how to take the survey online, at the end of the last class. Online is our preferred method of data collection. Answers to the survey questions can now be entered by the families on a computer, tablet, or smartphone, and the software that provides the analysis is provided by contractors independent of SFP Analytics.

Currently, we are using Remark Statistical Analysis software for the survey analysis. In March, we will begin using Qualtrics, a nationally recognized survey company, as our survey management company. Confidential responses can be entered by parents and youth online via a computer, tablet, or smartphone. (A paper survey can also be administered to parents and youth when necessary, for a small extra cost.) All survey data will be collected and analyzed electronically by Qualtrics. Qualtrics will provide outcome results to SFP Analytics, which will prepare a report for the participating agency that includes the Qualtrics data and statistics.

Survey Instrument

To improve outcome validity, a multi-informant assessment strategy is used for the outcome evaluation with instruments measuring: 1) parent pre-post responses, and 2) child pre-post responses. For in-home SFP family instruction, a therapist/SFP Family Coach report can be obtained and added for a small extra fee.

SFP Analytics uses a Retrospective Pre-Post Survey because research shows that more accurate outcome measures are obtained with a retrospective survey. This is because many parents, before they know and trust their SFP Family Coach, are afraid to admit weaknesses in a pre-test because they don’t want to look bad or get into trouble with local Family Services Agencies. But by the end of the 10 to 14 weekly classes, where they have improved and learned new skills and can show their progress, most are willing to admit where their parenting skills really were before the classes began. Having learned new SFP skills, they also have a better benchmark against which to accurately measure their previous parenting skills.

The questions in the SFP survey are abridged versions of psychometrically sound assessments. Estimates of internal consistency (presented below) are based on a Group Norms sample of 6000 families who have taken SFP classes, with the exception of the covert aggression scale, which is based on a larger study conducted in Ireland.

Five multi-item scales assess parenting-related skills, including Positive Parenting (e.g., “I praise my child when he/she behaves well”: α = .79); Parental Involvement (e.g., “I talk to my youth about his or her plans for the next day or week”: α = .75); Parenting Skills (e.g., “I use physical punishment when my child will not do what I ask”: α = .64); Parental Supervision (e.g., “I know where my child is and who he/she is with”: α = .70); and Parenting Efficacy (e.g., “I handle stress well”: α = .75).

The Parenting Skills, Parental Supervision, and Positive Parenting scales were taken from the Kumpfer SFP Skills instrument, and Parental Involvement items were taken from the Alabama Parenting Scale. In addition, four abridged scales were taken from the Moos Family Environment Scale to assess Family Cohesion (e.g., “I enjoy spending time with my child”: α = .75), Family Communication (e.g., “We hold a family meeting weekly”: α = .69), Family Conflict (e.g., “Our family argues a lot with each other”: α = .87), and Family Organization (e.g., “We go over schedules, chores, and rules to get better organized”: α = .71).

Items assessing cognitive, affective, and behavioral facets of depression were taken from a survey instrument used to evaluate the Good Behavior Game, a school-based intervention to reduce aggression, delinquency, and drug use. The items were originally culled from the Child Depression Inventory and the Child Behavior Checklist (CBCL). Parents rated their children’s mood and emotional tone with six items (e.g., “My child looks sad or down”: α = .64); the wording was simplified for those with lower reading ability. A six-item scale was used to assess covert aggression (e.g., skipping school or breaking the rules: α = .69), and a separate six-item scale assessed overt aggression (e.g., hitting or fighting: α = .75).

All scales were adapted from the Parent Observation of Children’s Adaptation (POCA). The POCA assesses how the child conforms to the family social world (i.e., their aggressive, disruptive behavior). It is a modification of the Teacher Observation of Classroom Adaptation-Revised, which assesses a child’s performance on core classroom tasks (accepting authority, social participation, and concentration) and their adaptational social status. The teacher rating instrument was originally developed as part of the Woodlawn, Chicago, early behavior management intervention study and was later used in evaluating the Good Behavior Game intervention. Recent psychometric evidence confirms the reliability of shortened scales from the Alabama Parenting Scale (APS).
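The α values quoted above are Cronbach’s alpha coefficients, a standard estimate of internal consistency. As a rough illustration only (this is not the software used by SFP Analytics, and the responses shown are invented for the example), alpha for one multi-item scale could be computed along these lines. The function name `cronbach_alpha` and the sample data are assumptions made for this sketch.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of respondents, each a list of k numeric item answers
    (for example, 1-5 ratings). Hypothetical helper, not part of any SFP tool.
    """
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same k >= 2 items")
    # Variance of each item across respondents
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of the summed scale score across respondents
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented 1-5 ratings from five parents on a three-item scale
example = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [3, 3, 2], [1, 2, 2]]
alpha = cronbach_alpha(example)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items in a scale move together; real estimates would of course come from the full survey data rather than a handful of rows.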

Parents and youth choose their responses from a five-point scale. For example, positively worded survey items (e.g., “I praise my child when he/she behaves well”) have response options of 1 (Never), 2 (Rarely), 3 (Sometimes), 4 (Often), and 5 (Most of the time).

Parents are also asked to evaluate their child’s past-month use of alcohol, cigarettes, and drugs, both before the SFP lessons (e.g., “In the 30 days before the SFP class, how many times do you think your child used the following”) and again after them.

Cost for Evaluation Services

There is a fee of $350 to use the surveys and receive an evaluation report. This fee covers up to 20 families. We charge $50 more for each set of 20 additional families. The surveys are available in Spanish and English for both parents and youth. We can send you sample pages of the Parent and Youth Pre-Post Surveys if needed.
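As a worked example of the fee schedule above, the following sketch estimates the cost for a given number of families. It assumes that a partial set of additional families is billed as a full set of 20, which the fee description does not state explicitly; please confirm with SFP Analytics. The function name `evaluation_fee` is made up for illustration.

```python
import math

BASE_FEE = 350         # covers up to 20 families (from the fee description)
FAMILIES_PER_SET = 20
EXTRA_SET_FEE = 50     # for each additional set of 20 families


def evaluation_fee(n_families):
    """Estimated evaluation fee in dollars (hypothetical helper)."""
    if n_families < 1:
        raise ValueError("n_families must be at least 1")
    extra_families = max(0, n_families - FAMILIES_PER_SET)
    # Assumption: a partial extra set is charged as a whole set
    extra_sets = math.ceil(extra_families / FAMILIES_PER_SET)
    return BASE_FEE + EXTRA_SET_FEE * extra_sets


for n in (12, 20, 21, 40, 45):
    print(n, evaluation_fee(n))
```

Under that assumption, an agency with 45 families would pay $350 plus two additional sets, or $450.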

A sample of the Parent and Youth survey instruments including references to publications supporting the rationale of the Combined Retrospective Parent Pre- and Post-test may be obtained by emailing: (preferred) or contacting Jaynie Brown at 385.226.3396.


A total change score is calculated, as well as summed scores for the parent and child outcomes. Effect sizes of the outcomes are calculated using both partial eta squared (or Cohen’s d) and the d’ statistic, for the cluster variables and the 16 individual outcome variables related to improvements in parent, family, and child risk factors and protective factors. Analyses of Variance (ANOVAs) and effect sizes for the pre- to post-test changes are conducted and reported in outcome tables, grouped by parent and child variables.
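To make the effect-size terms above concrete, here is a minimal sketch for a single outcome measured before and after the program. It assumes the paired-data form of Cohen’s d (mean change divided by the standard deviation of the changes) and uses the identity that, with only two time points, partial eta squared equals t² / (t² + df). The actual reports may use a different variant of d; the function name and the scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_effect_sizes(pre, post):
    """Return (cohens_d, partial_eta_squared) for paired pre/post scores.

    Hypothetical helper. d is the paired-data variant (mean change / SD of
    change); other variants divide by a pooled pre/post SD instead.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post need the same length (at least 2)")
    changes = [after - before for before, after in zip(pre, post)]
    n = len(changes)
    d = mean(changes) / stdev(changes)
    t = d * sqrt(n)            # paired t statistic
    df = n - 1
    partial_eta_sq = t ** 2 / (t ** 2 + df)
    return d, partial_eta_sq


# Invented scale scores for six families
pre_scores = [2, 3, 2, 4, 3, 2]
post_scores = [4, 4, 3, 5, 5, 3]
d, eta_sq = paired_effect_sizes(pre_scores, post_scores)
print(round(d, 2), round(eta_sq, 2))
```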

The pre- and post-test outcome results are normed against a database of over 6,000 families in the United States who have taken SFP classes in the past (and against a control group, if one is provided by the agency requesting the evaluation).

Subscales measure the hypothesized outcomes for SFP, namely:

  • Family Relationships, including bonding, cohesion, communication, organization, and family conflict
  • Parenting, including parenting style, discipline, monitoring, and parenting self-efficacy
  • Parent’s depression
  • Children’s social skills and resiliency
  • Children’s overt and covert aggression, depression, and conduct disorders
  • Association with using or anti-social peers
  • Children’s and parents’ tobacco, alcohol, and other drug use, and attitudes towards youth use

SFP Complex Research Measures

For complex research grants, more complex measures can also be used as listed below by informant and by construct. The dependent variables or latent constructs are ordered from the most proximal (parent and child alcohol and drug use) to the most distal (family and school environment) as predicted in the Social Ecology Model to be tested.

Table 1: Instruments by Informant Source by Construct

Parent Alcohol and Drug Use

  • Parent 30-day Alcohol and Drug Use (GPRA) 11-items
  • Parent Attitude Towards Adult Drug Use (GPRA) 3-items
  • Parent Attitude Towards Risk (GPRA/Household Survey) 5-items
  • Parent Thrill Seeking (Household Survey) 4-items
  • Family History of AOD Problems (CSAP Core) 1-item

Child Alcohol and Drug Use

  • Parent Attitude Towards Child Drug Use (Arthur) 3-items
  • Child 30-day Drug and Alcohol Use (GPRA) 11-items

Child or Parent Depression/Self Esteem or Self Concept

  • Child Depression Scale (Kellam POCA) 3-items
  • Parent Depression: (Mod. Beck) 11-items

Peer Influence

  • Susceptibility to Peer Pressure
  • Social Support for Non-drug Use
  • Peer Alcohol Use (Jessor & Jessor, 1977)

Academic Competency

  • School Report Cards (grades)

School Bonding

  • BASC: Attitude toward Teachers and School
  • Report Cards: Attendance, Tardy

Social Skills

  • BASC-Parent Rating
  • BASC Teacher Rating Scale
  • BASC-Child Rating Scale
    (Reynolds & Kamphaus, 1992)
  • Leadership/Social Skills
  • What About You (Gresham & Elliott)

Conduct Disorders/Self Regulation

  • Parent Observation of Children’s Adaptation (Kellam) TOCA
  • (POCA–anti-social and aggression scales 40-items)
  • Thrill Seeking (Household Survey)

Parenting Skills

  • Parent Child Affective Quality (Spoth & Redmond) 7-items
  • Family Attachment (Hawkins, CTC)
  • Family Management (Parenting) Scale (Arthur), 8-items
  • Parental Monitoring (Arthur) 3-items
  • Household Survey
  • Parent/Child Time Together, (Tolan) 4-items
  • Opportunities for Pro-social Involvement (Kumpfer/Arthur) 4-items
  • Rewards for Pro-social Involvement (Arthur) 2-items
  • Discipline Style (Alabama Parenting) 10-items

Family Environment

  • Family Conflict Scale (Hawkins, 3-items)
  • Family Cohesion Scale (Moos, 9-items)
  • Family Organization Scale (Moos, 7-items)
  • Family Mobility (HHS)

Total 169 Questions or Items

Most of these measures are Cross-site Family Core Measures selected by expert teams as the best measures having high reliability and change sensitivity. By selecting SAMHSA GPRA and Core Measures, we are able to compare our baseline data to other sites as well as the effectiveness of the outcomes. Scales that match were selected for comparability across source of data.

Retrospective Pre- and Post-tests with Triangulation across Parents, Youth, and Trainers.

Recently, some SFP sites have been finding negative effects on sensitive questions, such as drug use and severe discipline, from clients who do not trust the agency staff not to report them to authorities. Hence, on the pre-test they say they are “perfect parents” and their children are “perfect kids” with no problems, even though the children’s group leaders do not observe the children to be “perfect.” Then, on the post-test, the parents trust the staff more and report their problems accurately. When the data are analyzed, these families look like they have gotten worse when, in fact, they are much better. To check for positive biases on the pre-test due to lack of trust in the confidentiality of the data (found more in disenfranchised youth and families, such as poor, stigmatized, and some immigrant families), a short retrospective pre-test and post-test could also be given to the parents, child, and trainers. In this procedure, developed in school-based studies of drug-abusing adolescents by Rhodes & Jason (1988), the youth are asked about their baseline (pre-test) drug use again at the post-test. These retrospective pre-test data are then correlated with the actual pre-test data to determine the amount of potential bias in the pre-test.
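The bias check described above can be sketched as follows. This is only an illustration with invented numbers: it computes the Pearson correlation between actual and retrospective pre-test reports, plus the average amount by which the retrospective report exceeds the original one. The function names and the data are assumptions, not part of the Rhodes & Jason procedure as published.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_underreport(actual_pre, retro_pre):
    """Average (retrospective - actual) pre-test score; positive values
    suggest problems were under-reported on the original pre-test."""
    return mean(r - a for a, r in zip(actual_pre, retro_pre))


# Invented 30-day use counts reported by five youth
actual_pre = [0, 0, 1, 0, 2]   # answered before the program
retro_pre = [1, 2, 3, 1, 4]    # same baseline period, asked again at post-test
print(round(pearson_r(actual_pre, retro_pre), 2))
print(mean_underreport(actual_pre, retro_pre))
```

A high correlation with a large positive gap would suggest that families ranked themselves consistently but systematically understated problems at the start.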

Data Analysis

Means, standard deviations, and change scores are calculated for each question as well as for the sub-scales. Missing data are handled using multiple imputation programs. When two adults complete the parent interview items concerning the target child, inter-rater reliabilities are calculated, and a decision is made as to whether to average both scores or use only the mother’s self-reports, which are frequently found to be more valid (Fitzgerald, Zucker, Maguin, & Reider, 1994). Cronbach’s alpha reliabilities are calculated. Valid self-report data can be problematic with children younger than 9 years of age. Scales with low reliability will not be used; hence, some of the data for the 8-9 year olds may not be used in the final data analysis. Since not all child data will be used, the parents’ and therapist/trainers’ reports on the children are very important data sources, as are the archival school data.

Statistical significance is calculated by comparing the changes in the families participating in SFP with those in the comparison group, which could be families receiving any existing parenting services or families who are not receiving any parenting services. If there is no comparison group, then just compare the pre-test to the post-test paired means. Never include subjects who have dropped out in the analysis, as they can bias the data. These tests are calculated using standard SPSS software, first conducting an analysis of variance or covariance to determine whether there are any significant interactions in the data, as determined by the F-values. If there are significant F-values, then matching mean differences can be tested using t-tests, with one-tailed tests for hypothesized directions of effect. The effect sizes should then also be calculated for each major scale to determine how large the statistically significant effect was.
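For readers without SPSS, the two t-tests mentioned above can be sketched in a few lines. This is an illustration only, not the SPSS ANOVA/ANCOVA procedure itself: `paired_t` covers the no-comparison-group case, and `welch_t_on_changes` compares change scores between SFP and comparison families using Welch’s unequal-variance t statistic, which is our substitution for simplicity. Function names and data are invented. The sketch returns only the t statistic; a statistics package (for example SciPy’s `scipy.stats.ttest_rel` or `scipy.stats.ttest_ind`) would be needed for exact p-values.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(pre, post):
    """Paired t statistic and degrees of freedom for pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post need the same length (at least 2)")
    changes = [after - before for before, after in zip(pre, post)]
    n = len(changes)
    t = mean(changes) / (stdev(changes) / sqrt(n))
    return t, n - 1


def welch_t_on_changes(sfp_changes, comparison_changes):
    """Welch t statistic comparing mean change in two independent groups."""
    se = sqrt(variance(sfp_changes) / len(sfp_changes)
              + variance(comparison_changes) / len(comparison_changes))
    return (mean(sfp_changes) - mean(comparison_changes)) / se


# Invented scale scores for eight SFP families
pre = [2, 3, 2, 4, 3, 2, 3, 3]
post = [3, 4, 4, 4, 5, 3, 4, 4]
t, df = paired_t(pre, post)

# One-tailed critical value for df = 7 at alpha = .05, from a standard t table
CRITICAL_T_DF7 = 1.895
print(round(t, 2), df, t > CRITICAL_T_DF7)

# Invented change scores for eight comparison families
comparison_changes = [0, 1, 0, 0, -1, 1, 0, 0]
sfp_changes = [b - a for a, b in zip(pre, post)]
print(round(welch_t_on_changes(sfp_changes, comparison_changes), 2))
```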

Family Qualitative Outcome Data

While these are the best measures found by the CSAP Core Measures Expert Panel, it is not known how culturally valid these SAMHSA GPRA and Core Measures are for the various ethnic groups that could be participating in SFP studies. Following a strict protocol, qualitative data could be collected by the evaluation staff at baseline (pre-test and needs assessment) and post-test, as well as at the annual surveys. The transcriptions of the interviews would then be analyzed with a qualitative analysis software program (NUD*IST), looking for emerging themes in risk and protective factors and how they change after the interventions. In addition, categorically coded data could be entered into a computer from the structured and semi-structured parts of the interview protocol. The client participants and stakeholders in the Project Advisory Committee could structure the interview questions. Some ethnic clients relate better to being asked to tell their story about their changes than to rating their improvements on a five-point scale.

Staffing the Evaluation

The data are generally collected at the SFP sessions by the group leaders and site coordinator. It is best for them to collect the data because the families get to know and trust them.